![Shriyansh Agrawal, Developer in New Delhi, India](http://assets.toptal.io/images?url=http%3A%2F%2Fbs-uploads.toptal.io%2Fblackfish-uploads%2Ftalent%2Fprofile%2Fpicture_file%2Fpicture%2F1064487%2Fhuge_17732855025aea83b54e48ff2fe47d04-86022871dbbc78825e55070efedf704a.png&width=524)
Shriyansh Agrawal
Verified Expert in Engineering
Artificial Intelligence (AI) Developer
Shriyansh is a developer who loves designing solutions to real-world challenges through computing, for the greater good. He has worked remotely with many open source communities and companies and has presented his work at international conferences and meetups. Several of his projects have received prestigious awards.
Preferred Environment
PyCharm, Python, Big Data, Amazon Web Services (AWS), ETL, Artificial Intelligence (AI), Business Intelligence (BI), Data Engineering, Machine Learning, Data Science, REST APIs, Machine Learning Operations (MLOps), Data Pipelines, Full-stack, Software Development
The most amazing...
...work I've done is an AI prediction model for the US housing market that achieved competitive accuracy and made a significant impact on the market.
Work Experience
Data Scientist and Data Engineer via Toptal
Cal.net, Inc.
- Developed big data ETL pipelines to ingest versioned geographical data of California from multiple providers into a data lake. Currently, this lake consists of 28 million location rows, each with 100 feature columns.
- Designed an algorithmic approach to deduplicate locations ingested from various providers, assigning each a unique, deserializable hash ID so that analysts could view a single set of unique locations across California.
- Participated in submitting various regulatory ISP requirements and compliance filings, saving the company millions of dollars within a constrained timeframe.
- Helped the company analyze potential geographies for business expansion and win broadband grants from regulators.
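The deduplication idea above, a deterministic yet decodable location ID, can be sketched in Python. The field names and encoding here are illustrative assumptions, not the actual pipeline:

```python
import base64
import json
import re

def normalize_location(street: str, city: str, zip_code: str) -> dict:
    """Canonicalize the fields that identify a location, so cosmetic
    differences between providers collapse to one form."""
    def clean(s: str) -> str:
        return re.sub(r"\s+", " ", s.strip().upper())
    return {"street": clean(street), "city": clean(city), "zip": clean(zip_code)}

def location_id(street: str, city: str, zip_code: str) -> str:
    """Deterministic ID: the same place always maps to the same ID, and
    (unlike a cryptographic hash) the ID can be decoded back."""
    payload = json.dumps(normalize_location(street, city, zip_code),
                         sort_keys=True, separators=(",", ":"))
    return base64.urlsafe_b64encode(payload.encode()).decode()

def decode_location_id(loc_id: str) -> dict:
    """Recover the normalized fields, i.e., the 'deserializable' part."""
    return json.loads(base64.urlsafe_b64decode(loc_id.encode()).decode())
```

Two providers spelling the same address differently then yield the same ID, so the data lake keeps a single row per unique location.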
Business Intelligence Expert
Reward Gateway
- Developed a BI dashboard on Sisense using data from a Big Data cluster served via MariaDB.
- Built a dashboard that serves real-time and historical analytics with visual graphs and interactive filters.
- Profiled older SQL queries and improved their performance 12-fold using industry best practices.
- Tracked issues via Jira, handling 221 of them: 112 older tickets and 109 new ones assigned to me.
- Served two sub-clients, Deutsche Bank and Ericsson, delivering all requested use cases and functionality.
Senior Data Scientist
Freelance
- Developed an Automated Valuation Model (AVM) to predict the sales price of homes in the US. Led the project from its inception, growing the team from one to five members over time.
- Engaged with various real estate data providers to assess their datasets for quality and correctness and to check our models' applicability.
- Generated point-in-time sales price predictions for seven million houses in Ohio with accuracy competitive with market leaders who spent millions of dollars over several years to achieve the same result.
- Worked on the explainability of this AI model by generating nearby similar houses with similar ranks to explain and convince end-users why their home is marked at a specific price band.
- Trained models to generate sale price ranges with around 80% confidence, which users could further adjust based on walkthroughs and house conditions.
- Generated historical housing trends per geography so users could understand the market. Continued an ongoing market trend forecasting project to help online buyers anticipate investment ROI.
- Initiated steps for increasing predictiveness to include house photo conditions and use them in our AI model for conditional adjustments.
- Built market trend insights on Amazon QuickSight and Mode.com. These dashboards add geographical segmentation with restricted access control and are built so that project managers or non-technical clients can iterate on their own.
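The ~80% confidence range mentioned above can be illustrated with a minimal, stdlib-only sketch: an empirical central interval over comparable sale prices. The real model's method is not described in this profile, so this is an assumption for illustration:

```python
import statistics

def price_band(comparable_prices: list[float], confidence: float = 0.80) -> tuple[float, float]:
    """Empirical central interval: for an 80% band, return the
    10th and 90th percentiles of comparable sale prices."""
    tail_pct = round((1.0 - confidence) / 2.0 * 100)      # 10 for an 80% band
    qs = statistics.quantiles(comparable_prices, n=100, method="inclusive")
    return qs[tail_pct - 1], qs[100 - tail_pct - 1]       # 10th and 90th percentiles
```

A prediction then ships as a (low, high) band that end-users can tighten or widen after a walkthrough.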
Senior Machine Learning Engineer
Freelance
- Supported trade automation requirements at a high-frequency trading firm in India, including big data processing and data science on in-house server infrastructure.
- Developed a model to find similar stocks in varied trade markets. Applied profit-making strategies for similar collateral.
- Developed an in-house CI/CD pipeline to deploy strategies from a commit hash across the firm's in-house trading servers. Regarded as some of my best work, as it solved major deployment problems and removed the stress of production deployments.
- Communicated with multiple stakeholders from across the globe alongside trader teams to bring underlying data into a common format for daily reconciliation of individual trade attributes.
- Dramatically simplified traders' work: they now focus on strategy implementation and leave the rest to my services, backed by heavy monitoring and manual checks to ensure fault-free operation.
- Deployed a live profit and loss dashboard upon daily data influx with access control. Deployed an ETL pipeline, API gateway, and PyPI shell command to make this system user interactable. Included automation on the cloud, log monitoring, and alerts.
Machine Learning Engineer
Fourkites
- Developed a Kafka-streamed, Spark ETL pipeline for big data processing in Hadoop clusters to produce AI prediction metrics over various geographies on Grafana.
- Designed a feature layer over an AWS data lake to ease infra consumption by individual ML models.
- Orchestrated automation and alerts using Airflow.
- Eased the coupling between training and production infra using TensorFlow Serving.
- Tested code in UAT and staging environment before pushing it for production.
- Performed exploratory data analysis (EDA) for business development and insights using pandas.
- Developed microservices with API integrations for seamless SaaS operations on the Ruby on Rails (RoR) platform.
- Built insights dashboards on Sisense and Grafana, designed so that project managers, non-technical clients, or clients' end users would find them easy to use. My major contribution was improving SQL query time by 10x.
Open Source Contributor
Plone
- Developed a Plone add-on named collective.ifttt, which acts as a webhook integrated with IFTTT services to allow auto-exchange of information between platforms. If news were published on the site, it would automatically tweet about it.
- Awarded #1 Plone add-on of 2018 at the annual Plone Conference in Tokyo for the collective.ifttt add-on.
- Developed another Plone add-on named plone.importexport, which deserializes Zope data into a human-readable file system format to assist non-technical users with CRUD operations on data, such as import and export.
- Oversaw upgrades of the plone.importexport add-on in the following years so it could serve as a core offering of Plone, a Python CMS platform.
Software Developer
FairShuffle
- Designed a client-side graphics rendering framework for online card games with real-time updates.
- Performed asynchronous rendering of various layers without any observable delay. The primary challenge was adhering to multiple themes based on configurations provided by Adobe Photoshop.
- Highlighted multiple "wow" effects in the game to attract user attention.
Experience
AVM for the US Housing Market
Chief elements of this project are:
• To connect with different real estate data providers to assess datasets for quality and correctness along with their applicability in our models
• Since a startup funded this project, we had to ensure highly productive use of the provided funding
• To achieve accuracy competitive with market leaders, who have spent millions on the same problems.
• To explain predicted prices to end-users by finding similar recent sales in the nearby region from big data.
• Market trends data generation alongside price forecasting to help investors realize their ROI.
• To produce the historical price of all houses in certain states to demonstrate market trends.
• To generate all these predictions in real-time, where we are dealing with over ten million houses in a single state.
PnL Dashboard For Algorithmic Trading Firm
Real-time stats were served using Kafka integrations, and historical charts were generated using the ETL pipeline, where data was extracted from AWS cloud storage.
These dashboards are integrated with trading systems, where strategies and users can interact via CLI and API.
Feature Layer over Data Lake
I designed a feature layer over the firm's data lake (http://www.fourkites.com/), where the requirements of all ML models were examined to serve a common feature layer, easing the firm's infra consumption by 14x.
The challenge was to bring harmony among features, ensuring that there was no hampering of any running ML service in production.
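As a rough sketch of the idea, a feature layer is a shared registry: each feature is computed once and cached, and every ML model reads the same value instead of recomputing it against the lake. The class and feature names below are illustrative assumptions, not FourKites code:

```python
class FeatureLayer:
    """Shared registry of named feature functions; each (feature, record)
    pair is computed once and reused by every model that asks for it."""
    def __init__(self):
        self._features = {}   # name -> feature function
        self._cache = {}      # (name, record_id) -> computed value

    def register(self, name):
        def wrap(fn):
            self._features[name] = fn
            return fn
        return wrap

    def get(self, name, record_id, record):
        key = (name, record_id)
        if key not in self._cache:            # compute once, share afterwards
            self._cache[key] = self._features[name](record)
        return self._cache[key]

layer = FeatureLayer()

@layer.register("miles_per_stop")             # hypothetical logistics feature
def miles_per_stop(record):
    return record["miles"] / record["stops"]
```

Models then call `layer.get("miles_per_stop", shipment_id, record)` and never duplicate feature code, which is what keeps production services in harmony when a feature definition changes.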
Skills
Languages
Python, Bash Script, Python 3, SQL, JavaScript, R, HTML, CSS, HTML5, Ruby, Scala, Pine Script, C++, C Shell, Java
Libraries/APIs
REST APIs, Interactive Brokers API, Plotly.js, GitHub API, Pandas, Scikit-learn, Node.js, PyTorch, Social Media APIs, Amazon API, Twilio API, Google Maps API, D3.js, PySpark, TensorFlow Deep Learning Library (TFLearn), NumPy, Asyncio, Fabric, React, TensorFlow
Tools
PyCharm, Amazon SageMaker, GitLab, Plotly, GitHub, Git, Slack, Grafana, Amazon QuickSight, AWS Glue, Tableau, Atlassian, BigQuery, DataGrip, Spotfire, GIS, Ansible, Terraform, Google Analytics, Apache Airflow, Sisense, AWS CLI, GitLab CI/CD, Jenkins, PyPI, SAS Business Intelligence (BI), Kibana
Paradigms
ETL, Data Science, REST, Good Clinical Practice (GCP), ETL Implementation & Design, B2B, Automation, Business Intelligence (BI), Acceptance Testing
Platforms
MacOS, Linux, Apache Kafka, Amazon Web Services (AWS), Jupyter Notebook, Kubernetes, Azure, AWS Lambda, Amazon, Google Cloud Platform (GCP), IFTTT, Blockchain, Jet Admin
Storage
PostgreSQL, PostgreSQL 10, Amazon S3 (AWS S3), Data Pipelines, MySQL, Elasticsearch, Redshift, Data Lakes, Data Lake Design, AWS Data Pipeline Service, NoSQL, Amazon DynamoDB, Apache Hive, Google Cloud, Database Administration (DBA), MongoDB, MariaDB, Neo4j
Other
Telegram Bots, Big Data, Data Engineering, APIs, Data Reporting, Data Analytics, ETL Development, Data Architecture, Data Analysis, Datasets, Data Collection, Machine Learning Operations (MLOps), Real-World Evidence, API Integration, Automation Scripting, Large-scale Projects, Analytical Dashboards, Software Development, Data Processing, Multithreading, Software Architecture, Data Entry, Automated Data Flows, Business Analysis, Analytics, Data Manipulation, Reports, Artificial Intelligence (AI), Business Solutions, Machine Learning, Web Scraping, Data Modeling, Data Visualization, Serverless, Data Warehousing, Data Warehouse Design, Data Quality, Data Cleaning, Data Matching, Cloud Architecture, Time Series Analysis, Forecasting, Amazon Machine Learning, Statistical Analysis, Model Development, Classification Algorithms, Mobile Games, Predictive Modeling, Statistics, Dashboards, Computer Vision, Financial Forecasting, Marketing Attribution, Business to Business (B2B), Full-stack, Data Scraping, Geographic Information Systems, GeoPandas, Geospatial Data, Real-time Data, Data Management, Data Governance, Architecture, Algorithms, Pricing Models, Data-driven Marketing, Back-end, Expert Systems, Decision Trees, Decision Modeling, Data-driven Decision-making, ChatGPT, Chatbots, User Interface (UI), eCommerce APIs, Extensions, eCommerce, GPU Computing, Web Development, Scraping, Amazon Marketplace, Endpoint Security, Minimum Viable Product (MVP), Time Series, Reinforcement Learning, Integration, Algorithmic Trading, Frameworks, Quotations, Large Language Models (LLMs), Generative Artificial Intelligence (GenAI), Leadership, System Design, Digital Solutions, Marketing Technology (MarTech), B2C Marketing, Client Success, EDA, CI/CD Pipelines, Servers, Data Aggregation, Data Recovery, HubSpot, Valuation, Webhooks, PEP 8, Cloud Data Fusion, Natural Language Processing (NLP), Real-time Business Intelligence, Grafana 2, Google BigQuery, Deep Learning, Language Models, Reporting, BI Reports, BI Reporting, Marketing Mix Modeling, GPT, Generative Pre-trained Transformers (GPT), OpenAI GPT-3 API, OpenAI GPT-4 API, QGIS, Google Earth
Frameworks
Spark, Django, Hadoop, Ruby on Rails 4, Sphinx Documentation Generator, Plone
Industry Expertise
Trading Systems