Responsibilities:

  1. Design and implement efficient, stable web crawlers that collect data at scale, then process and analyze the results (a minimal sketch of such a pipeline follows this list).
  2. Research and optimize multi-platform information extraction, data cleaning, data warehousing, and exposing collected data as services.
  3. Resolve the data requirements and interface issues that arise during development.
  4. Participate in business requirement discussions and own solutions from business requirements through technical implementation.
  5. Monitor the health of running crawlers and handle day-to-day stability and accuracy issues in data collection.
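
To give a concrete picture of the fetch-extract-clean pipeline described above, here is a minimal sketch in Python using the widely used requests and BeautifulSoup libraries. The seed URL, the CSS selectors, and the final storage step are illustrative assumptions, not a prescribed design.

    import requests
    from bs4 import BeautifulSoup

    # Hypothetical seed; a production crawler would pull URLs from a queue or frontier.
    SEED_URL = "https://example.com/listings"

    def fetch(url: str) -> str:
        """Fetch a page, failing fast so a scheduler can retry or skip it."""
        resp = requests.get(url, timeout=10, headers={"User-Agent": "demo-crawler/0.1"})
        resp.raise_for_status()
        return resp.text

    def extract(html: str) -> list[dict]:
        """Extract records; the 'div.item' selector assumes a hypothetical page layout."""
        soup = BeautifulSoup(html, "html.parser")
        records = []
        for node in soup.select("div.item"):
            title = node.select_one("h2")
            records.append({"title": title.get_text(strip=True) if title else None})
        return records

    def clean(records: list[dict]) -> list[dict]:
        """Drop records that fail basic validation before warehousing."""
        return [r for r in records if r["title"]]

    if __name__ == "__main__":
        rows = clean(extract(fetch(SEED_URL)))
        # In production, rows would go to a warehouse or message queue, not stdout.
        print(rows)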

Knowledge and Skills:

  1. Experience designing, developing, and maintaining efficient, stable web crawling systems that improve the efficiency and quality of data collection.
  2. Familiar with common web crawling frameworks and tools.
  3. Proficient in Python and Java.
  4. Familiar with basic Linux commands.
  5. Proficient with mainstream open-source frameworks and tools such as Spring Boot, Spring Cloud, Maven, Redis, and MyBatis.
  6. Familiar with the design and application of distributed systems, including distribution, caching, and messaging mechanisms (a brief cache-aside sketch follows this list).
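
To make the caching mechanism in item 6 concrete, here is a minimal cache-aside sketch using the redis-py client. The key scheme, the 5-minute TTL, and the load_from_db function are illustrative assumptions; a Java service would implement the same pattern with Spring's Redis support.

    import json
    import redis

    r = redis.Redis(host="localhost", port=6379, db=0)

    def load_from_db(user_id: int) -> dict:
        # Hypothetical stand-in for a real database query.
        return {"id": user_id, "name": "example"}

    def get_user(user_id: int) -> dict:
        """Cache-aside read: try the cache first, fall back to the backing store."""
        key = f"user:{user_id}"              # illustrative key scheme
        cached = r.get(key)
        if cached is not None:
            return json.loads(cached)        # cache hit
        user = load_from_db(user_id)         # cache miss: load from the store
        r.setex(key, 300, json.dumps(user))  # repopulate with an assumed 5-minute TTL
        return user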