Responsibilities:
- Design and implement efficient, stable web crawlers to collect data at scale, then process and analyze it.
- Research and optimize multi-platform information extraction, data cleaning, data warehousing, and exposing data as services.
- Resolve data-requirement and interface issues that arise during development.
- Participate in business requirement discussions and own the delivery of solutions from business requirements through to technical implementation.
- Monitor crawler health and handle day-to-day stability and data-accuracy issues in data collection.
Knowledge and Skills:
- Experience designing, developing, and maintaining efficient, stable web crawling systems that improve the efficiency and quality of data collection.
- Familiar with various web crawling frameworks and tools.
- Proficient in Python and Java.
- Familiar with basic Linux commands.
- Proficient with mainstream open-source frameworks and tools such as Spring Boot, Spring Cloud, Maven, Redis, and MyBatis.
- Familiar with the design and application of distributed systems, including distribution, caching, and messaging mechanisms.