- 🎓 Software Engineer (Program completed – degree pending)
- 🛠️ Junior Data Engineer with hands-on experience building data pipelines for real-world datasets
- 🔄 Focused on ETL/ELT workflows, data modeling, and analytics-ready data layers
- 📊 Background in applied Machine Learning and data-driven research (3 published articles)
- 💻 Strong proficiency in Python and SQL
- 🌍 Fluent English (technical & professional)
- Build and maintain ETL pipelines from raw data to structured datasets
- Clean, validate, and transform real-world, noisy data
- Design relational data models optimized for analytics and ML
- Work with large tabular datasets using Python, SQL, and Spark
- Prepare data layers consumed by ML models and analytical tools
- Programming: Python, SQL
- Data Engineering:
  - ETL / ELT pipeline development
  - Batch-oriented data processing
  - Data cleaning, validation, and transformation
  - Relational data modeling
- Big Data: Apache Spark, PySpark, Spark SQL
- Orchestration: Apache Airflow (DAGs, scheduling, dependencies)
- Transformations: dbt
- Databases: PostgreSQL, MySQL, SQLite
- Cloud Fundamentals: AWS / GCP
- ML Support: feature engineering, dataset preparation, reproducible pipelines
- Version Control: Git, GitHub
- Co-author of 3 applied Machine Learning articles using real experimental datasets
- Worked with structured and semi-structured data extracted from scientific sources
- Designed reproducible data preprocessing pipelines for modeling and evaluation
- Strong focus on data quality, consistency, and interpretability
- 🎓 Software Engineer (program completed – degree pending)
- 🛠️ Junior Data Engineer with hands-on experience in data pipelines
- 🔄 Focused on ETL/ELT processes, data modeling, and analytics layers
- 📊 Experience in applied Machine Learning and real-world data (3 published articles)
- 💻 Strong command of Python and SQL
- 🌍 Advanced English (technical and professional)
- Building data pipelines from raw sources to structured datasets
- Cleaning, validating, and transforming noisy, real-world data
- Designing relational models for analytics and ML
- Processing data with Python, SQL, and Spark
- Preparing datasets for analytical and scientific use
- Programming: Python, SQL
- Data Engineering: ETL/ELT pipelines, batch processing, relational modeling
- Big Data: Apache Spark, PySpark
- Orchestration: Apache Airflow
- Transformations: dbt
- Databases: PostgreSQL, MySQL, SQLite
- Cloud (fundamentals): AWS / GCP
- ML Support: dataset and feature preparation
- Version Control: Git, GitHub



