August 2021 - Present
- Develop ETL/automation/pipeline jobs with Airflow
- Maintain and develop new features in existing Flask applications
- Use Spark (AWS EMR) for production jobs that join multiple data sources and write Parquet files (see the first sketch after this list)
- Developed, and currently maintain, a Spark pipeline spanning multiple Airflow DAGs that is critical to data operations
- Create automation tools for a variety of needs, such as working with AWS and parsing large JSON or Parquet files
- Perform ad hoc data analysis using Jupyter, DuckDB, and Polars (see the second sketch after this list)
- Use Docker and Vagrant for local development
- Work in an agile culture with two-week sprints
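For illustration, a minimal sketch of a Spark job of the shape described above: it joins two sources and writes partitioned Parquet. The bucket, paths, columns, and join key are hypothetical placeholders, not the production pipeline.

```python
# Minimal sketch of a join-and-write Spark job; all paths and
# column names below are hypothetical placeholders.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("join_and_write_parquet").getOrCreate()

# Two hypothetical source datasets (e.g., S3 paths when running on EMR).
orders = spark.read.parquet("s3://example-bucket/orders/")
customers = spark.read.json("s3://example-bucket/customers/")

# Join the sources on a shared key.
joined = orders.join(customers, on="customer_id", how="left")

# Write the result back out as partitioned Parquet.
joined.write.mode("overwrite").partitionBy("order_date").parquet(
    "s3://example-bucket/joined_orders/"
)

spark.stop()
```

On EMR, a job like this would typically be submitted via spark-submit or as a cluster step, often triggered from an Airflow task.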
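And a minimal sketch of the kind of ad hoc analysis mentioned above, assuming a hypothetical local Parquet file: DuckDB queries the file in place with plain SQL and hands the result to Polars.

```python
# Minimal ad hoc analysis sketch; the file name and columns
# are illustrative assumptions.
import duckdb
import polars as pl

# DuckDB can query Parquet files in place with plain SQL.
result = duckdb.sql(
    """
    SELECT customer_id, COUNT(*) AS n_orders, SUM(amount) AS total
    FROM 'orders.parquet'
    GROUP BY customer_id
    ORDER BY total DESC
    LIMIT 10
    """
).pl()  # hand the result to Polars as a DataFrame

print(result)
```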
Data Analyst/Scientist at CH Robinson
January 2013 - August 2021
- Develop BI apps using frameworks such as Flask, Streamlit, R Shiny, and Plotly
- Use BI tools such as Power BI to create dashboards
- Develop in a Linux environment and use Docker for containerization
- Apply modeling techniques such as regression to surface insights (see the second sketch after this list)
- Use Jupyter notebooks for data analysis and visualizations
- Write complex queries in MS SQL Server, Hadoop/Hive, and Postgres
- Develop data pipelines using Python and Airflow (see the first sketch after this list)
- Work in an agile environment and use Git for managing code and projects
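A minimal sketch of an Airflow pipeline of the shape described above; the DAG id, schedule, and task bodies are hypothetical placeholders.

```python
# Minimal Airflow DAG sketch; dag_id, schedule, and task
# bodies are hypothetical placeholders.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract():
    print("pull data from a source system")


def load():
    print("write transformed data to the warehouse")


with DAG(
    dag_id="example_etl",
    start_date=datetime(2021, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    load_task = PythonOperator(task_id="load", python_callable=load)

    extract_task >> load_task
```

A real pipeline would replace the placeholder callables with extract/transform/load logic against the databases listed above.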
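And a minimal sketch of using a regression to pull a quick insight. The data is synthetic and the scenario (shipment weight vs. cost) and scikit-learn dependency are assumptions for illustration.

```python
# Minimal regression sketch on synthetic data; the feature and
# target here are hypothetical, chosen only for illustration.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)

# Synthetic example: does shipment weight predict shipping cost?
weight = rng.uniform(1, 100, size=200).reshape(-1, 1)
cost = 5.0 + 0.8 * weight.ravel() + rng.normal(0, 4, size=200)

model = LinearRegression().fit(weight, cost)
print(f"slope={model.coef_[0]:.2f}, intercept={model.intercept_:.2f}")
print(f"R^2={model.score(weight, cost):.3f}")
```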
- General
  - Python
  - Docker
  - Flask
  - Linux scripting
  - JavaScript
- Data
  - Spark
  - SQL (Postgres, MS SQL)
  - Airflow
  - REST APIs
  - Polars / DuckDB
- Cloud
  - AWS EC2
  - AWS EMR
  - AWS Lambda
  - AWS CloudWatch
  - Boto3 (Python)
Information Technology - B.S.
- Coursework focused on programming and relational databases
Atmospheric Science - B.S.
- Research student at the National Severe Storms Laboratory (NSSL); also earned a minor in Mathematics