Research Engineer Graduate (Machine Learning Sys - US) - 2024 Start (PhD) - Seattle

TikTok

Oct 19

💼 Graduate Job

Seattle

AI-generated summary

  • The ideal candidate for the Research Engineer Graduate role at TikTok should have a PhD in Computer Science or a related technical field, strong coding ability in C/C++ or Python, experience in large-scale distributed systems, and familiarity with machine learning frameworks. They should also have prior experience in large-scale projects and in NLP and CV algorithms, along with curiosity about new technologies and entrepreneurship.
  • The candidate will be responsible for the design and development of large-scale machine learning systems, addressing technical challenges in concurrency, reliability, and scalability. They will work on resource scheduling, model training, inference, data management, and workflow orchestration. Additionally, they will research and introduce advanced technologies in machine learning systems and collaborate with algorithm teams to optimize the algorithm and system jointly.

Description

  • The Applied Machine Learning - Machine Learning Systems team provides an end-to-end (E2E) machine learning experience and machine learning resources for the company. The team builds heterogeneous ML training and inference systems based on GPUs and advanced chip technology, and advances the state of the art in ML systems technology to accelerate models such as Stable Diffusion, language models, and multi-modality models. The team is also responsible for the research and development of hardware acceleration technologies for cloud computing, via technologies such as distributed systems, compilers, HPC, and RDMA networking. The team is reinventing the ML infrastructure for large-scale language models.
  • We are looking for talented individuals to join our team in 2024. As a graduate, you will get unparalleled opportunities to kickstart your career, pursue bold ideas, and explore limitless growth opportunities. Co-create a future driven by your inspiration with TikTok.
  • Successful candidates must be able to commit to a start date before the end of 2024. Please state your availability and graduation date clearly in your resume.

Requirements

  • PhD graduate with a background in Computer Science or a related technical field, or equivalent industrial research experience.
  • Must obtain work authorization in the country of employment at the time of hire, and maintain ongoing work authorization during employment.
  • Excellent coding ability, a solid foundation in data structures and basic algorithms, and proficiency in C/C++ or Python; winners of ACM/ICPC, NOI/IOI, and other competitions are preferred.
  • Familiar with at least one mainstream machine learning framework (TensorFlow/PyTorch/JAX).
  • Mastery of the principles of distributed systems, with experience participating in the design, development, and maintenance of large-scale distributed systems.
  • Strong sense of responsibility, good learning ability, communication ability, and self-motivation.
  • Good communication and collaboration skills, able to explore new technologies with the team and promote technological progress.

Preferred Experience

  • Prior experience in large-scale projects or papers with great influence in the field of large models.
  • Familiar with NLP- and CV-related algorithms and technologies, and experienced in large-model training and RL algorithms.
  • Experience in one of the following fields: CUDA, RDMA, AI Infrastructure, HW/SW Co-Design, High-Performance Computing (CUTLASS, NCCL), ML Hardware Architecture (GPU, Accelerators, Networking), ML for Systems, and Distributed Storage.
  • Demonstrated related technical experience from previous internships, work experience, coding competitions, or publications.
  • Curiosity towards new technologies and entrepreneurship
  • High levels of creativity and quick problem-solving capabilities

Education requirements

PhD

Area of Responsibilities

Data

Responsibilities

  • Responsible for developing the machine learning systems behind the company's large-scale models, and for researching new applications and solutions for related technologies in areas such as search, recommendation, advertising, content creation, conversation, and customer service, in order to meet users' growing demand for intelligent interaction and improve how they live and communicate.
  • The main work directions include:
  • Responsible for the design and development of the architecture of large-scale machine learning systems, solving technical difficulties such as high concurrency, high reliability, and high scalability of the system.
  • Covering the various sub-directions of machine learning systems, including resource scheduling, model training, model inference, data management, and workflow orchestration.
  • Responsible for the research and introduction of advanced technologies in machine learning systems, such as the latest hardware architecture, heterogeneous computing systems, and compiler-based optimization technologies.
  • Working closely with the algorithm teams to optimize the algorithm and system jointly.

Details

Work type

Full time

Work mode

Office

Location

Seattle