Data Scientist – NLP | HopHR
Job Description
The Data Scientist for the AI & Automation Solutions team will provide expert guidance on NLP concepts, assist in technical design reviews, build text corpora for classification models, use all available data to train and re-train intents, develop and execute a virtual agent performance measurement strategy, and continuously research and prototype the latest data science concepts related to NLP and NLU.
The Data Scientist is responsible for working with the AI & Automation Solution Consultants, IVA Developers, and Conversation Designers to ensure the highest levels of accuracy and performance in AI-based solutions such as virtual agents and agent-guided solutions.
Job requirements
- NLP: Knowledge of NLP concepts and hands-on experience building text classification ML/statistical models and implementing entity extraction algorithms. Ability to make data-driven decisions to train and re-train virtual agent intents. Experience extracting relevant data from user transcripts, including cleaning, lemmatizing, and building text corpora for classification models.
- Machine Learning: Knowledge of and experience with designing and implementing algorithms (clustering, decision tree learning, GLM/regression, random forest, text mining, social network analysis, etc.), and the ability to articulate their real-world advantages and drawbacks.
- Statistical Methodology: Knowledge of techniques and concepts (sampling, A/B testing, regression, properties of distributions, statistical tests and their proper usage, etc.). Ability to understand and improve model performance.
- Business Acumen: Understanding of the bigger picture for customers and the business, and the ability to probe beyond stakeholders’ stated requests to understand their real needs and the value of information.
Qualifications
- Graduate degree in Statistics, Mathematics, Human-Computer Interaction, Computer Science, Cognitive Science, Linguistics, Engineering or similar.
- 4+ years’ experience in Python, R, and applied statistics
- Familiarity with bot frameworks such as IBM Watson, Amazon Lex, Rasa, or Google Dialogflow
- Experience with deep learning frameworks and interpretability of deep learning models
- Experience with ML libraries such as scikit-learn, TensorFlow, and Keras on both CPU and GPU compute architectures
- Hands-on experience with LLMs, prompt engineering, fine-tuning, and evaluation