Software Engineer, Web
NewsBreak
About NewsBreak
NewsBreak is redefining the way users interact with local news and their communities. By connecting local users, local content creators, and local businesses, we aim to foster safer, more vibrant, and authentically connected lives. Through collaborations with thousands of local publishers and businesses across the nation, NewsBreak is changing how a new wave of readers access and engage with essential, locally sourced content and information.
Since our inception in 2015, our trajectory has been nothing short of remarkable. We proudly stand as the nation’s premier local news app.
As a Series C unicorn startup, we are headquartered in Mountain View, California, with additional offices in New York City and Seattle. For more information, visit www.newsbreak.com/about
Location: Mountain View, California, United States (Onsite; 4 days in office required)
About the Role
As a Software Engineer (Web), you will play a pivotal role in building and optimizing our data acquisition infrastructure, ensuring the reliable collection and processing of web data to support critical business needs. You will tackle complex technical challenges in web crawling, improve the performance of large-scale distributed systems, and contribute to solutions that support data-driven decision-making. If you are passionate about web technologies, data extraction, and system optimization, this is an opportunity to make a significant impact.
Responsibilities
- Design, develop, and maintain distributed web crawler systems, ensuring efficient data scheduling, scraping, parsing, and storage.
- Collect and process data from the internet and partner sources in compliance with website policies and legal regulations.
- Optimize crawler performance to handle large-scale data extraction with high efficiency and reliability.
- Solve complex technical challenges related to web crawling, including anti-crawling mechanisms, dynamic content rendering, and data quality assurance.
- Collaborate with cross-functional teams, including data scientists, backend engineers, and product managers, to meet diverse business data requirements.
- Stay updated with the latest web technologies and crawling frameworks to continuously improve system capabilities.
Requirements
- Bachelor’s degree or higher in Computer Science, Engineering, or a related field, with at least 2 years of experience in web crawling and data collection.
- Proficiency in mainstream web scraping technologies and frameworks/tools such as Scrapy, Selenium, and Puppeteer.
- Strong coding skills in at least one programming language, such as Python, Java, Go, or C++.
- Solid understanding of web technologies, including HTML, CSS, JavaScript, and web protocols (HTTP/HTTPS).
- Experience with distributed systems, data pipelines, and storage solutions is a plus.
- Strong problem-solving skills, attention to detail, and the ability to work independently or as part of a team.
- Familiarity with frontend development and dynamic content rendering techniques is preferred.
Benefits
We offer a competitive benefits package:
- Health, dental, and vision care for you and your family (100% coverage for employees)
- Top-tier 401(k) plan with company matching
- Paid time off and paid holidays
- FSA, HSA, and commuter benefits programs
- Team activity budget