Position Overview:
We are seeking a technically strong, battle-tested Big Data Developer to join our core data platform team in Japan. You will take a hands-on role in designing and building the company's next-generation data platform, with responsibility for planning and optimizing a high-performance, scalable data processing architecture that covers the full pipeline from data collection, cleaning, and storage to real-time processing and data services. The platform will directly support the company's data-driven strategy, enable the business to respond quickly to market changes, and provide strong data support to the algorithm, analytics, and product teams.
Here, you will have the opportunity to work with massive data volumes, high-concurrency stream processing, and a diverse set of technical components (such as Flink, ClickHouse, Kafka, etc.), and to take part in building key systems from 0 to 1. If you are passionate about data engineering, join our team and push technical boundaries in an international, multi-team collaboration environment!
Job Responsibilities:
• Participate in the design, development and optimization of the company's big data platform, supporting sustained business growth and data-driven decision making.
• Responsible for building, performance tuning, monitoring and daily operations of the ClickHouse cluster, ensuring high availability and high performance of the system.
• Participate in the development, deployment and operation of Flink stream processing tasks, ensuring the stable operation of real-time data processing pipelines.
• Support business-side data service requirements by developing data API services based on Java Spring Boot.
• Assist the data science team in building a data pipeline to support the training and online services of algorithm models.
• Continuously optimize data processing workflows and system architecture to improve system stability and scalability.
• Actively take on responsibilities in projects, drive problem solving, and embody the spirit of ownership. Maintain strong execution and communication skills under high pressure.
Position Requirements:
• Bachelor's degree or above; majors in computer science, software engineering, information technology, or related fields are preferred.
• More than 5 years of big data experience.
• Solid database foundation: familiar with relational databases (such as MySQL, PostgreSQL) and columnar databases.
• Understand the principles of ClickHouse, Doris, etc., including their underlying storage, query optimization, and other key mechanisms.
• Strong development skills: proficient in Java, with hands-on experience developing REST API services using the Spring Boot framework.
• Excellent communication and collaboration skills, able to work closely with data analysis, product and business teams.
• Strong sense of responsibility, good stress resistance and problem-solving abilities, and proactive in pushing work forward.
Bonus points if you also have:
1. Practical experience with components such as Kafka, Zookeeper, HDFS, Hive, and Airflow;
2. Project experience with recommendation systems, user profiling, personalized ranking, or similar;
3. Background in prediction algorithms (such as time series prediction or user behavior prediction);
4. Familiarity with, and hands-on practice in, applying large language models (LLMs, such as ChatGPT, Llama, etc.) to business or system scenarios (such as data augmentation, automatic summarization, intelligent question answering, etc.);
5. Participation in real-time data warehouse and metric system building projects;
6. Experience building a big data platform from scratch.
ClickHouse
• Proficient in ClickHouse data modeling, query optimization, and resource management;
• Have practical experience in cluster operations, fault diagnosis and performance monitoring;
• Familiar with integrating ClickHouse with data middleware, stream processing, and scheduling systems.
Flink
• Familiar with Flink's underlying mechanisms, such as checkpointing, state backends, and windowing;
• Able to independently write complex Flink real-time processing tasks and solve performance issues during execution;
• Have practical experience in Flink cluster deployment and stability maintenance.