Shriyansh Agrawal, Developer in New Delhi, India

Shriyansh Agrawal

Verified Expert  in Engineering

Artificial Intelligence (AI) Developer

Location
New Delhi, India
Toptal Member Since
March 31, 2022

Shriyansh is a developer who loves designing solutions to real-world challenges through computing and for the greater good. He has worked remotely with many open source communities and companies and has presented his work at international conferences and meetups. Some of Shriyansh's work has received notable recognition.

Portfolio

Cal.net, Inc.
Data Engineering, Python, GIS, Data Pipelines, QGIS, ETL, Google Earth
Reward Gateway
Business Intelligence (BI), Sisense, Reporting, BI Reports, BI Reporting...
Freelance
Python, Big Data, Artificial Intelligence (AI), ETL, PyCharm, Data Science...

Availability

Full-time

Preferred Environment

PyCharm, Python, Big Data, Amazon Web Services (AWS), ETL, Artificial Intelligence (AI), Business Intelligence (BI), Data Engineering, Machine Learning, Data Science, REST APIs, Machine Learning Operations (MLOps), Data Pipelines, Full-stack, Software Development

The most amazing...

...work I've done is an AI prediction model for the US housing market that achieved competitive accuracy and had a significant impact on the market.

Work Experience

Data Scientist and Data Engineer via Toptal

2023 - PRESENT
Cal.net, Inc.
  • Developed big data ETL pipelines to ingest versioned geographical data of California from multiple providers into a data lake. The lake currently holds 28 million location rows, each with 100 feature columns.
  • Designed an algorithmic approach to deduplicate locations ingested from various location providers and assigned each a unique, deserializable hash ID so that analysts can view a single set of unique locations across California.
  • Participated in preparing and submitting regulatory ISP requirements and compliance filings under tight deadlines, saving the company millions of dollars.
  • Helped the company analyze potential geographies for business expansion and win broadband grants from regulators.
Technologies: Data Engineering, Python, GIS, Data Pipelines, QGIS, ETL, Google Earth
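The deduplication bullet above can be illustrated with a minimal sketch. It is not the pipeline used at Cal.net: it assumes that two records within roughly 11 m of each other refer to the same location, and it uses a reversible grid encoding rather than a cryptographic hash so that the ID can be decoded back to a coordinate. The names `CELL_DEG`, `location_id`, `decode_location_id`, and `deduplicate`, and the `lat`/`lon` record keys, are all hypothetical.

```python
import base64
import struct

# Assumed grid size in degrees (about 11 m of latitude); not from the source.
CELL_DEG = 1e-4


def location_id(lat: float, lon: float) -> str:
    """Snap a coordinate to the grid and pack the cell indices into a short ID."""
    lat_idx = round(lat / CELL_DEG)
    lon_idx = round(lon / CELL_DEG)
    packed = struct.pack(">ii", lat_idx, lon_idx)  # two signed 32-bit integers
    return base64.urlsafe_b64encode(packed).decode("ascii").rstrip("=")


def decode_location_id(loc_id: str) -> tuple:
    """Recover the grid-cell centre (lat, lon) that an ID encodes."""
    padded = loc_id + "=" * (-len(loc_id) % 4)  # restore stripped base64 padding
    lat_idx, lon_idx = struct.unpack(">ii", base64.urlsafe_b64decode(padded))
    return lat_idx * CELL_DEG, lon_idx * CELL_DEG


def deduplicate(records):
    """Keep the first record seen for each location ID (hypothetical record shape)."""
    unique = {}
    for rec in records:
        unique.setdefault(location_id(rec["lat"], rec["lon"]), rec)
    return unique


if __name__ == "__main__":
    rows = [
        {"provider": "A", "lat": 38.58160, "lon": -121.49440},
        {"provider": "B", "lat": 38.58161, "lon": -121.49441},  # near-duplicate of A
        {"provider": "B", "lat": 34.05220, "lon": -118.24370},
    ]
    for loc_id, rec in deduplicate(rows).items():
        print(loc_id, decode_location_id(loc_id), rec["provider"])
```

A fixed grid keeps the ID decodable, but two points on opposite sides of a cell boundary receive different IDs, so a production approach would likely also need address matching or a neighbor-cell check.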

Business Intelligence Expert

2022 - 2023
Reward Gateway
  • Developed a BI dashboard on Sisense using data from a Big Data cluster served via MariaDB.
  • Built a dashboard that serves real-time and historical analytics with visual graphs and interactive filters.
  • Profiled older SQL queries and made them 12 times faster using industry best practices.
  • Tracked issues via Jira, handling 221 of them: 112 older tickets and 109 new ones assigned to me.
  • Served two sub-clients, Deutsche Bank and Ericsson, and delivered all requested use cases and functionalities.
Technologies: Business Intelligence (BI), Sisense, Reporting, BI Reports, BI Reporting, MariaDB, Big Data, SQL, Datasets, Data Collection, Data Pipelines, Marketing Attribution, Business to Business (B2B), Marketing Mix Modeling, B2B, Full-stack, NoSQL, Real-World Evidence, Real-time Data, Data Management, Data Governance, Architecture, MySQL, Algorithms, Pricing Models, Data-driven Marketing, API Integration, Social Media APIs, Back-end, AWS Lambda, HTML, CSS, HTML5, Expert Systems, ChatGPT, Amazon API, Extensions, Automation Scripting, Large-scale Projects, eCommerce, Analytical Dashboards, Twilio API, Software Development, GPU Computing, Data Processing, Web Development, Amazon, Scraping, Amazon Marketplace, Endpoint Security, Google Maps API, Minimum Viable Product (MVP), Plotly, Plotly.js, GitHub, GitHub API, Software Architecture, Google Cloud Platform (GCP), Data Entry, Automated Data Flows, Time Series, Business Analysis, Analytics, Data Manipulation, Reports, Quotations, Generative Artificial Intelligence (GenAI), Leadership, Git, System Design, OpenAI GPT-3 API, OpenAI GPT-4 API

Senior Data Scientist

2020 - 2022
Freelance
  • Developed an Automated Valuation Model (AVM) to predict the sales price of homes in the US. Led the project from its inception as the team grew from one to five members.
  • Worked with different real estate data providers to assess their datasets for quality and correctness and to check their applicability to our models.
  • Generated point-in-time sales price predictions for seven million houses in Ohio with competitive accuracy compared to market leaders who spent millions of dollars to achieve the same result over the past few years.
  • Worked on the explainability of this AI model by surfacing similar nearby houses with similar ranks to show end-users why their home falls in a specific price band.
  • Trained models to generate a range of sale prices with around 80% confidence, which users can adjust further based on a walkthrough and the condition of the house.
  • Generated historical housing trends for individual geographies to help users understand the market. Continued an ongoing market trend forecasting project to help online buyers anticipate investment ROI.
  • Began work to improve predictive accuracy by incorporating house condition from photos into our AI model as conditional adjustments.
  • Built market trend insights on Amazon QuickSight and Mode.com. These trends include geographical segmentation with restricted access control and are built so that project managers or non-technical clients can iterate on their own.
Technologies: Python, Big Data, Artificial Intelligence (AI), ETL, PyCharm, Data Science, Google Analytics, Automation, Google Cloud, HubSpot, Pandas, NumPy, Scikit-learn, Amazon SageMaker, Data Engineering, PostgreSQL, Database Administration (DBA), Asyncio, GitLab CI/CD, AWS CLI, Valuation, MacOS, Telegram Bots, Python 3, SQL, PostgreSQL 10, Business Intelligence (BI), APIs, REST, Web Scraping, Good Clinical Practice (GCP), Amazon QuickSight, Amazon Web Services (AWS), Amazon S3 (AWS S3), Machine Learning, Redshift, Data Reporting, Data Analytics, AWS Glue, Data Modeling, Data Visualization, Natural Language Processing (NLP), Generative Pre-trained Transformers (GPT), GPT, SAS Business Intelligence (BI), Real-time Business Intelligence, Node.js, Tableau, GitLab, Atlassian, Data Lakes, Data Warehousing, Data Warehouse Design, Data Lake Design, AWS Data Pipeline Service, Jupyter Notebook, ETL Implementation & Design, ETL Development, Data Architecture, Data Quality, Data Cleaning, Data Matching, Cloud Architecture, Data Analysis, Google BigQuery, Time Series Analysis, Forecasting, Amazon Machine Learning, DataGrip, Statistical Analysis, Model Development, Deep Learning, PyTorch, Classification Algorithms, Spotfire, Predictive Modeling, Statistics, REST APIs, Language Models, Dashboards, Kibana, Datasets, Data Collection, Computer Vision, R, Machine Learning Operations (MLOps), Data Pipelines, Financial Forecasting, Marketing Attribution, Business to Business (B2B), Marketing Mix Modeling, B2B, Full-stack, NoSQL, Data Scraping, Real-World Evidence, Kubernetes, GIS, Geographic Information Systems, GeoPandas, Geospatial Data, Azure, Real-time Data, Data Management, Data Governance, Architecture, MySQL, Algorithms, Pricing Models, Data-driven Marketing, API Integration, Social Media APIs, Back-end, AWS Lambda, HTML, CSS, HTML5, Expert Systems, Decision Trees, Decision Modeling, Data-driven Decision-making, User Interface (UI), Amazon API, eCommerce APIs, Extensions, 
Automation Scripting, Large-scale Projects, eCommerce, Analytical Dashboards, Twilio API, Software Development, GPU Computing, Data Processing, Amazon, Scraping, Amazon Marketplace, Multithreading, Ansible, Endpoint Security, Minimum Viable Product (MVP), Plotly, Plotly.js, GitHub, GitHub API, Software Architecture, Google Cloud Platform (GCP), Data Entry, Automated Data Flows, Time Series, Reinforcement Learning, Integration, Business Analysis, Algorithmic Trading, Pine Script, Analytics, Data Manipulation, Frameworks, Amazon DynamoDB, Terraform, Reports, Quotations, Large Language Models (LLMs), Generative Artificial Intelligence (GenAI), Leadership, Git, System Design

Senior Machine Learning Engineer

2019 - 2021
Freelance
  • Supported trade automation at a high-frequency trading firm in India, including big data processing and data science on in-house server infrastructure.
  • Developed a model to find similar stocks across different trade markets and applied profit-making strategies to similar collateral.
  • Developed an in-house CI/CD pipeline to deploy strategies from a commit hash across the firm's in-house trading servers. I regard this as some of my best work, as it solved major deployment problems and removed the stress of production deployment.
  • Communicated with multiple stakeholders across the globe, alongside trader teams, to bring underlying data into a common format for daily reconciliation of individual trade attributes.
  • Simplified the work of traders considerably, as they now focus on strategy implementation and leave the rest to my services. This involved heavy monitoring and manual checks, with zero tolerance for faults.
  • Deployed a live profit and loss dashboard, updated on each daily data influx, with access control. Deployed an ETL pipeline, API gateway, and PyPI shell command so that users can interact with the system. Included automation on the cloud, log monitoring, and alerts.
Technologies: Python, Big Data, ETL, Automation, Data Engineering, Machine Learning, Data Science, Artificial Intelligence (AI), Marketing Technology (MarTech), Bash Script, C++, Apache Kafka, PySpark, Pandas, NumPy, APIs, Asyncio, CI/CD Pipelines, Servers, Data Aggregation, Data Recovery, AWS CLI, C Shell, PostgreSQL, Business Solutions, Linux, MacOS, Telegram Bots, Python 3, SQL, PostgreSQL 10, Elasticsearch, REST, Web Scraping, Good Clinical Practice (GCP), Amazon Web Services (AWS), Amazon S3 (AWS S3), Data Reporting, Data Analytics, Data Modeling, Data Visualization, Serverless, JavaScript, GitLab, Atlassian, BigQuery, Data Warehousing, Data Warehouse Design, AWS Data Pipeline Service, Jupyter Notebook, ETL Implementation & Design, ETL Development, Data Architecture, Data Quality, Data Cleaning, Data Matching, Cloud Architecture, Data Analysis, Time Series Analysis, Forecasting, Amazon SageMaker, DataGrip, Statistical Analysis, Model Development, PyTorch, Classification Algorithms, Predictive Modeling, Statistics, REST APIs, Scikit-learn, Language Models, Dashboards, Datasets, Data Collection, Machine Learning Operations (MLOps), Data Pipelines, Financial Forecasting, Business to Business (B2B), B2B, Full-stack, Data Scraping, Real-World Evidence, GIS, Geographic Information Systems, Geospatial Data, Azure, Real-time Data, Data Management, Data Governance, Architecture, MySQL, Algorithms, Pricing Models, Data-driven Marketing, API Integration, Social Media APIs, Back-end, HTML, CSS, HTML5, Decision Trees, Decision Modeling, Data-driven Decision-making, User Interface (UI), Amazon API, eCommerce APIs, Extensions, Automation Scripting, Large-scale Projects, Analytical Dashboards, Software Development, GPU Computing, Data Processing, Amazon, Scraping, Amazon Marketplace, Interactive Brokers API, Multithreading, Trading Systems, Ansible, Endpoint Security, Google Maps API, Plotly, Plotly.js, GitHub, GitHub API, D3.js, Software Architecture, Google Cloud Platform (GCP), Scala, Java, Data Entry, Automated Data Flows, Time Series, Reinforcement Learning, Integration, Business Analysis, R, Analytics, Data Manipulation, Frameworks, Amazon DynamoDB, Reports, Leadership, Git, System Design, OpenAI GPT-3 API

Machine Learning Engineer

2018 - 2019
Fourkites
  • Developed a Kafka-streamed, Spark ETL pipeline for big data processing in Hadoop clusters to produce AI prediction metrics over various geographies on Grafana.
  • Designed a feature layer over an AWS data lake to ease infra consumption by individual ML models.
  • Orchestrated automation and alerts using Airflow.
  • Eased the coupling between training and production infra using TensorFlow Serving.
  • Tested code in UAT and staging environments before pushing it to production.
  • Performed exploratory data analysis (EDA) for business development and insights using pandas.
  • Developed microservices with API integrations for seamless SaaS operations on the Ruby on Rails (RoR) platform.
  • Built insights dashboards on Sisense and Grafana, designed so that project managers, non-technical clients, and clients' end-users would find them easy to use. My major contribution was making the SQL queries 10 times faster.
Technologies: Artificial Intelligence (AI), Automation, Big Data, ETL, Apache Airflow, Apache Kafka, PySpark, Python, Sisense, Grafana, B2C Marketing, Client Success, Hadoop, Apache Hive, Kubernetes, TensorFlow Deep Learning Library (TFLearn), Pandas, NumPy, Ruby on Rails 4, PostgreSQL, EDA, MacOS, Slack, Digital Solutions, Python 3, SQL, PostgreSQL 10, Elasticsearch, TensorFlow, Spark, Business Intelligence (BI), APIs, REST, Web Scraping, Good Clinical Practice (GCP), Amazon QuickSight, Amazon Web Services (AWS), Amazon S3 (AWS S3), Machine Learning, Redshift, Data Reporting, Data Analytics, AWS Glue, Data Modeling, Data Visualization, Grafana 2, SAS Business Intelligence (BI), JavaScript, Tableau, Data Science, GitLab, Atlassian, BigQuery, Data Lakes, Data Warehousing, Data Warehouse Design, Data Lake Design, AWS Data Pipeline Service, Jupyter Notebook, ETL Implementation & Design, ETL Development, Data Architecture, Data Quality, Data Cleaning, Data Matching, Cloud Architecture, Data Analysis, Forecasting, Amazon Machine Learning, Statistical Analysis, Spotfire, Predictive Modeling, REST APIs, Scikit-learn, Language Models, Dashboards, Datasets, Data Collection, Machine Learning Operations (MLOps), Data Pipelines, Full-stack, Data Scraping, Real-World Evidence, GIS, Geographic Information Systems, Geospatial Data, MySQL, Algorithms, API Integration, Back-end, Decision Trees, Decision Modeling, Data-driven Decision-making, Django, Automation Scripting, Large-scale Projects, Twilio API, Software Development, Data Processing, Web Development, Amazon, Scraping, Amazon Marketplace, Interactive Brokers API, Multithreading, Ruby, Plotly, Plotly.js, GitHub, GitHub API, D3.js, Software Architecture, Scala, Java, Data Entry, Automated Data Flows, Time Series, Reinforcement Learning, Integration, Business Analysis, Neo4j, Blockchain, Analytics, Data Manipulation, Frameworks, Git

Open Source Contributor

2017 - 2019
Plone
  • Developed a Plone add-on named collective.ifttt, which acts as a webhook integrated with IFTTT services to allow automatic exchange of information between platforms. For example, when news is published on the site, it is automatically tweeted.
  • Awarded #1 Plone add-on of 2018 at the annual Plone conference held in Tokyo for the collective.ifttt add-on.
  • Developed another Plone add-on named plone.importexport, which converts Zope data into a human-readable file system format so that non-technical users can perform CRUD operations, such as import and export, on the data.
  • Oversaw upgrades of the plone.importexport add-on in the following years so that it could become part of the core offering of Plone, a Python CMS platform.
Technologies: Python, Webhooks, Acceptance Testing, CI/CD Pipelines, Sphinx Documentation Generator, Plone, IFTTT, Jenkins, PyPI, PEP 8, Python 3, APIs, REST, Good Clinical Practice (GCP), GitLab, Full-stack, Algorithms, Django, Scraping, Software Architecture, Neo4j, Git

Software Developer

2016 - 2018
FairShuffle
  • Designed a client-side graphics rendering framework for online card games with real-time updates.
  • Implemented asynchronous rendering of various layers without any observable delay. The primary challenge was supporting multiple themes based on the configuration provided by Adobe Photoshop.
  • Added multiple eye-catching visual effects to the game to attract user attention.
Technologies: Fabric, React, Node.js, MongoDB, Serverless, JavaScript, GitLab, Mobile Games, Algorithms, Django, Web Development, Amazon Marketplace, Ruby, Minimum Viable Product (MVP), Software Architecture, Neo4j, Blockchain

AVM for the US Housing Market

Developed a data-driven sales price prediction model for US homes. We used cutting-edge data science techniques to achieve competitive accuracy with market leaders, who have spent millions of dollars to get the same information.

Chief elements of this project were:
• Connecting with different real estate data providers to assess their datasets for quality, correctness, and applicability to our models.
• Making productive use of the available funding, since a startup financed the project.
• Achieving competitive accuracy with market leaders, who have spent millions on the same problem.
• Explaining predicted prices to end-users by finding similar recent sales in the nearby region from big data.
• Generating market trend data alongside price forecasts to help investors realize their ROI.
• Producing the historical prices of all houses in certain states to demonstrate market trends.
• Generating all these predictions in real time, for over ten million houses in a single state.
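The idea of explaining a predicted price through similar recent sales nearby can be sketched as follows. This is only an illustration under stated assumptions, not the project's method: the `Sale` fields, the 2 km default radius, and ranking by living area alone are all hypothetical choices, and a real comparables search would weigh many more attributes.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Sale:
    """A recent sale record; every field name here is hypothetical."""
    address: str
    lat: float
    lon: float
    sqft: float
    price: float


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def comparable_sales(subject, sales, radius_km=2.0, k=3):
    """Return up to k sales within radius_km, closest in living area first."""
    nearby = [
        s for s in sales
        if haversine_km(subject.lat, subject.lon, s.lat, s.lon) <= radius_km
    ]
    nearby.sort(key=lambda s: abs(s.sqft - subject.sqft))
    return nearby[:k]


def price_band(comps):
    """Lowest and highest sale price among the comparables, or None if empty."""
    if not comps:
        return None
    prices = [c.price for c in comps]
    return min(prices), max(prices)


if __name__ == "__main__":
    # Made-up data for illustration only.
    home = Sale("subject home", 39.9600, -82.9900, 1600, 0)
    recent = [
        Sale("comp a", 39.9650, -82.9900, 2500, 300000),
        Sale("comp b", 39.9620, -82.9950, 1500, 210000),
        Sale("far away", 41.4993, -81.6944, 1600, 250000),
    ]
    comps = comparable_sales(home, recent, radius_km=5)
    print([c.address for c in comps], price_band(comps))
```

Showing the end-user the comparables and the resulting band gives them concrete evidence for a price range, independent of how the underlying model produced its estimate.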

PnL Dashboard For Algorithmic Trading Firm

Developed a PnL dashboard on Grafana serving real-time and historical statistics with complex aggregation and filtering mechanisms.
Real-time stats were served using Kafka integrations, and historical charts were generated by an ETL pipeline that extracted data from AWS cloud storage.
These dashboards are integrated with trading systems, where strategies and users can interact via CLI and API.

Feature Layer over Data Lake

http://www.fourkites.com/
Individual ML models each have their own feature transformation requirements, which incur high server infrastructure costs when every model computes them separately.
I designed a Feature Layer over the firm's data lake, where the requirements of all ML models are examined to serve a common Feature Layer, thus reducing the firm's infra consumption by a factor of 14.
The challenge was to harmonize features across models without disrupting any ML service running in production.
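A shared feature layer of this kind can be sketched in a few lines. This is a toy, in-memory illustration under assumptions, not the design used at the firm: the `FeatureLayer` class, its methods, and the example feature names are hypothetical. It shows the two ideas described above: each feature is computed once and reused by every model that needs it, and conflicting definitions of the same feature are rejected.

```python
from typing import Callable, Dict, Iterable, Tuple

Row = Dict[str, float]


class FeatureLayer:
    """Computes each registered feature at most once per row and shares it across models."""

    def __init__(self) -> None:
        self._transforms: Dict[str, Callable[[Row], float]] = {}
        self._cache: Dict[Tuple[str, str], float] = {}
        self.compute_count = 0  # number of transforms actually executed

    def register(self, name: str, fn: Callable[[Row], float]) -> None:
        """Add a feature definition; a second definition of the same name is an error."""
        if name in self._transforms:
            raise ValueError(f"feature {name!r} is already registered")
        self._transforms[name] = fn

    def get(self, row_id: str, row: Row, names: Iterable[str]) -> Dict[str, float]:
        """Return the requested features for one row, computing only what is not cached."""
        out = {}
        for name in names:
            key = (row_id, name)
            if key not in self._cache:
                self._cache[key] = self._transforms[name](row)
                self.compute_count += 1
            out[name] = self._cache[key]
        return out


if __name__ == "__main__":
    layer = FeatureLayer()
    # Hypothetical features for illustration only.
    layer.register("speed_kmh", lambda r: r["distance_km"] / r["hours"])
    layer.register("is_long_haul", lambda r: float(r["distance_km"] > 500))

    trip = {"distance_km": 600.0, "hours": 8.0}
    model_a = layer.get("trip-1", trip, ["speed_kmh", "is_long_haul"])
    model_b = layer.get("trip-1", trip, ["speed_kmh"])  # served from the cache
    print(model_a, model_b, layer.compute_count)
```

In the demo, two models request three feature values in total, but only two transforms run; the savings grow with the overlap between models' feature sets.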

Languages

Python, Bash Script, Python 3, SQL, JavaScript, R, HTML, CSS, HTML5, Ruby, Scala, Pine Script, C++, C Shell, Java

Libraries/APIs

REST APIs, Interactive Brokers API, Plotly.js, GitHub API, Pandas, Scikit-learn, Node.js, PyTorch, Social Media APIs, Amazon API, Twilio API, Google Maps API, D3.js, PySpark, TensorFlow Deep Learning Library (TFLearn), NumPy, Asyncio, Fabric, React, TensorFlow

Tools

PyCharm, Amazon SageMaker, GitLab, Plotly, GitHub, Git, Slack, Grafana, Amazon QuickSight, AWS Glue, Tableau, Atlassian, BigQuery, DataGrip, Spotfire, GIS, Ansible, Terraform, Google Analytics, Apache Airflow, Sisense, AWS CLI, GitLab CI/CD, Jenkins, PyPI, SAS Business Intelligence (BI), Kibana

Paradigms

ETL, Data Science, REST, Good Clinical Practice (GCP), ETL Implementation & Design, B2B, Automation, Business Intelligence (BI), Acceptance Testing

Platforms

MacOS, Linux, Apache Kafka, Amazon Web Services (AWS), Jupyter Notebook, Kubernetes, Azure, AWS Lambda, Amazon, Google Cloud Platform (GCP), IFTTT, Blockchain, Jet Admin

Storage

PostgreSQL, PostgreSQL 10, Amazon S3 (AWS S3), Data Pipelines, MySQL, Elasticsearch, Redshift, Data Lakes, Data Lake Design, AWS Data Pipeline Service, NoSQL, Amazon DynamoDB, Apache Hive, Google Cloud, Database Administration (DBA), MongoDB, MariaDB, Neo4j

Other

Telegram Bots, Big Data, Data Engineering, APIs, Data Reporting, Data Analytics, ETL Development, Data Architecture, Data Analysis, Datasets, Data Collection, Machine Learning Operations (MLOps), Real-World Evidence, API Integration, Automation Scripting, Large-scale Projects, Analytical Dashboards, Software Development, Data Processing, Multithreading, Software Architecture, Data Entry, Automated Data Flows, Business Analysis, Analytics, Data Manipulation, Reports, Artificial Intelligence (AI), Business Solutions, Machine Learning, Web Scraping, Data Modeling, Data Visualization, Serverless, Data Warehousing, Data Warehouse Design, Data Quality, Data Cleaning, Data Matching, Cloud Architecture, Time Series Analysis, Forecasting, Amazon Machine Learning, Statistical Analysis, Model Development, Classification Algorithms, Mobile Games, Predictive Modeling, Statistics, Dashboards, Computer Vision, Financial Forecasting, Marketing Attribution, Business to Business (B2B), Full-stack, Data Scraping, Geographic Information Systems, GeoPandas, Geospatial Data, Real-time Data, Data Management, Data Governance, Architecture, Algorithms, Pricing Models, Data-driven Marketing, Back-end, Expert Systems, Decision Trees, Decision Modeling, Data-driven Decision-making, ChatGPT, Chatbots, User Interface (UI), eCommerce APIs, Extensions, eCommerce, GPU Computing, Web Development, Scraping, Amazon Marketplace, Endpoint Security, Minimum Viable Product (MVP), Time Series, Reinforcement Learning, Integration, Algorithmic Trading, Frameworks, Quotations, Large Language Models (LLMs), Generative Artificial Intelligence (GenAI), Leadership, System Design, Digital Solutions, Marketing Technology (MarTech), B2C Marketing, Client Success, EDA, CI/CD Pipelines, Servers, Data Aggregation, Data Recovery, HubSpot, Valuation, Webhooks, PEP 8, Cloud Data Fusion, Natural Language Processing (NLP), Real-time Business Intelligence, Grafana 2, Google BigQuery, Deep Learning, Language Models, 
Reporting, BI Reports, BI Reporting, Marketing Mix Modeling, GPT, Generative Pre-trained Transformers (GPT), OpenAI GPT-3 API, OpenAI GPT-4 API, QGIS, Google Earth

Frameworks

Spark, Django, Hadoop, Ruby on Rails 4, Sphinx Documentation Generator, Plone

Industry Expertise

Trading Systems
