Veeva Systems is a mission-driven organization and pioneer in industry cloud, helping life sciences companies bring therapies to patients faster. As one of the fastest-growing SaaS companies in history, we surpassed $3B in revenue in our last fiscal year with extensive growth potential ahead.
At the heart of Veeva are our values: Do the Right Thing, Customer Success, Employee Success, and Speed. We're not just any public company – we made history in 2021 by becoming a public benefit corporation (PBC), legally bound to balance the interests of customers, employees, society, and investors.
The Role
Data scientists play a pivotal role as data-driven decision engines within Veeva OpenData. We expect them to translate business teams' ambiguous requests into quantifiable data problems, use data analysis to provide a solid foundation for decision-making, and improve the overall data product through new algorithms. We also expect the candidate to combine strong algorithm development capabilities with business acumen, turning cutting-edge AI technologies into actionable business value.
The Data Scientist position is part of the Veeva OpenData Product team. This role provides data-backed support to both internal and external customers while coordinating with cross-functional teams to ensure the delivery of related features and to maintain product excellence.
Lead the design and iterative upgrades of data matching algorithms, including HCP matching, HCO matching, and other business scenarios involving matching
Be responsible for the design and monitoring of data validation results' storage functions in the main database
Manage internal and external data sources currently used in OpenData, including data source collection, data structure transformation, and update mechanism design
Utilize NLP, vectorization, and large language model technologies to design algorithms that address business team challenges, optimizing performance and efficiency in business scenarios
Collaborate with business teams to analyze production-related issues through data analysis and provide reasonable solutions
Consolidate tool requirements arising from data management processes, draft requirement documents, and coordinate development team resources to ensure timely implementation
At least a bachelor's degree in math, statistics, computer science, or equivalent relevant working experience
5+ years of experience in data modeling or algorithm development, with complete project implementation cases
Proficient in Python/Excel, familiar with frameworks such as TensorFlow/PyTorch
Mastery of the principles and tuning methods of algorithms such as decision trees, SVMs, and neural networks
Expertise in relational databases or data warehouse products, like MySQL, PostgreSQL, Redshift, Snowflake
Proficient in prompt engineering, and able to skillfully use tools such as Dify for prototype verification
Strong data sensitivity and logical analysis ability
Excellent cross-team communication skills and the ability to translate results into business outcomes
Fluent in written English and good at spoken English communication
Ability to use tools such as Power BI or Tableau to present data analysis results
Pharma industry knowledge
Master data management experience
Grants for fitness and communication
Healthy snacks, provided free