Job Description
Job Overview
ProperBird, a Munich-based startup in the prop-tech industry, is seeking a Python Data Crawler and Pipeline Manager to join our growing team. The successful candidate will manage the crawling process behind our real-time data collection and will maintain and improve our data pipeline infrastructure. You will work closely with our development team to keep our data systems and databases running efficiently.
Responsibilities
- Crawl new websites and streamline the data pipeline
- Develop, maintain, and improve data pipelines for real-time data collection
- Optimize code for efficiency and runtime, particularly for large-scale operations
- Manage database operations including backups, restores, and migrations
- Monitor and maintain data quality and integrity
- Collaborate with the development team to identify and resolve data issues
- Clean and preprocess data for use in analytics and reporting
- Work with product management and other stakeholders to define data needs and requirements
Job Requirements
- Bachelor’s degree or higher in Computer Science, Engineering, or a related field
- Strong proficiency in Python
- Experience with Airflow, MongoDB, PostgreSQL, Selenium CHR capture, Amazon S3, and GitLab (CI/CD)
- Knowledge of data warehousing and ETL processes
- Experience with large-scale data processing, data modeling, and data architecture
- Excellent analytical and problem-solving skills
- Strong communication and collaboration skills
- Ability to work independently with minimal supervision