• Built and maintained ETL and reporting data pipelines using Spark, Scala, and Python to ingest data from MongoDB, PostgreSQL, Logstash, Kafka, and other sources, and store it in the cloud data warehouse in Parquet format.
• Implemented and maintained AWS Deequ to monitor metrics across data sets in the data warehouse, allowing us to quickly detect data quality issues.
• Created and maintained Hive tables with Presto to allow fast aggregation and querying of data. These aggregations and queries are mostly used to build dashboards in Apache Superset, which in turn help Swipejobs make key business decisions about the product.
• Designed dashboards in Apache Superset to give business users access to the information they need to make decisions. Managed security within the platform to restrict access to certain data sources.
• Developed and maintained a Spring Boot-based microservice to deliver reporting to users.
• Developed Scala and Python scripts to update production databases, perform business-critical functions, and import and export data to and from external systems.
• Developed Scala and Python scripts to update production collections in MongoDB.
• Worked as an NLP Data Science intern with The Future Society on Project AIMS (AI against Modern Slavery).
• The project works with other foundations such as WIKIRate and BHRRC (Business and Human Rights Resource Centre) to apply modern technologies like AI and Data Science to major social issues such as modern slavery and human rights violations.
• The project draws on the UK's Modern Slavery Act 2015 to build a Machine Learning tool that automates the analysis of statements produced by businesses under the UK and Australian Modern Slavery Acts, in order to boost compliance and help combat and eradicate modern slavery.
• During my internship, I built a Machine Learning-based approach to predict whether a given business statement or MSA report was approved by the company's board of directors, approved by someone else, or does not mention approval at all.
• Used pre-trained BERT and RoBERTa transformer models to perform the NLP task of predicting whether a company's submitted report is explicitly signed by its senior management.
• Also helped design and document the metrics used to evaluate the accuracy of the ML models on the provided MSA reports.
• Worked as an NLP Data Science/Engineer intern with D-Ford Melbourne, the Human Centric Design Team of Ford Australia.
• Worked on the Ford Beacon Research Project, which researches the future of Ford's compact trucks, such as the Ford Ranger, in the global market. The research covers 4 main regions: China, Estonia, Ghana, and Thailand.
• As a Data Science intern, I used text analysis techniques such as Topic Modeling, Feature Engineering, and Sentiment Analysis to extract meaningful insights from unstructured data that could be used to catalyze the D-Ford team's brainstorming process.
• Built a Flask-based web application on top of the results of these techniques to give the team interactive visualizations.
• Worked as an SAP DW/BI Developer for the client EDF Energy, one of the UK's largest energy and gas suppliers.
• Worked with SAP DW modeling, SAP BI reporting, SAP BO reporting, BEx Query, and Tableau reporting.
• Developed a DW model from scratch: extracting data from the ECC and ERP systems, transforming it, storing it in DSOs and Cubes as per the client's requirements, generating reports on it, and delivering the result files to the client location via FTP.
• Worked on business-related activities such as the monthly UBR activity (Unbilled Billing Run), Selective and Full Refresh of data (to keep the source and BI systems in sync), resolving business incidents and queries, and deploying infrastructure changes.
• Generated business reports using BEx reporting tools, Open Hubs, Business Objects, and visualization platforms like Tableau.
• Handled user queries (incidents) using platforms like Citrix Receiver Remedy, JIRA, and other query management tools.
• Handled Access Management as one of the assigned management tasks.
* Professionally trained in SAP Data Warehousing and Business Intelligence modelling on SAP NetWeaver.
* Worked as a BI developer and reports analyst for the client EDF Energy, one of the largest gas and electricity producers and suppliers in Europe.