I applied through a recruiter. The process took 4 weeks. I interviewed at Edit (Bath, England) in Oct 2023.
Interview
Teams interview with the director of data science to discuss my background and current CV.
A second interview followed, with an 8-question worksheet whose answers had to be delivered as a 30-45 minute presentation to team members, talking through my answers and workings. The questions were both mathematical and code-based.
Following this, I was ghosted and received no response of any kind.
Interview questions [8]
Question 1
Dangerous fires are uncommon – 0.5%. Smoke is fairly common – 15%, and 90% of dangerous fires make smoke. On what % of occasions does smoke mean a dangerous fire? Please briefly explain your answer.
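One way to approach this question is Bayes' theorem, using only the three figures it gives. The short sketch below is my own working, not an official answer from the worksheet, and the variable names are mine.

```python
# Figures taken from the question text.
p_fire = 0.005             # P(dangerous fire)
p_smoke = 0.15             # P(smoke)
p_smoke_given_fire = 0.90  # P(smoke | dangerous fire)

# Bayes' theorem: P(fire | smoke) = P(smoke | fire) * P(fire) / P(smoke)
p_fire_given_smoke = p_smoke_given_fire * p_fire / p_smoke

print(f"P(dangerous fire | smoke) = {p_fire_given_smoke:.1%}")  # prints 3.0%
```

So smoke points to a dangerous fire only about 3% of the time, because dangerous fires are rare compared with smoke.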
You have trained an initial model to predict the propensity for charity supporters to leave a donation to charity in their will. Validation has shown that the model achieves 98% accuracy. Is this performance acceptable? Why or why not? If not, what steps could be taken to improve model performance?
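A likely angle here is class imbalance: if very few supporters leave a legacy gift, accuracy alone says little. The sketch below uses an invented 2% base rate (an assumption for illustration, not a figure from the question) to show that a model which never predicts a legacy gift would still score 98% accuracy.

```python
# Hypothetical validation set: 1,000 supporters, 20 of whom (2%) leave a legacy gift.
# The 2% base rate is an assumption for illustration, not a figure from the question.
y_true = [1] * 20 + [0] * 980

# A trivial "model" that never predicts a legacy gift.
y_pred = [0] * 1000

true_pos = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
correct = sum(t == p for t, p in zip(y_true, y_pred))

accuracy = correct / len(y_true)  # share of all predictions that are right
recall = true_pos / sum(y_true)   # share of actual legacy givers that are found

print(f"accuracy = {accuracy:.0%}, recall = {recall:.0%}")  # accuracy = 98%, recall = 0%
```

Under that assumption, metrics such as recall, precision or a precision-recall curve would be more informative than accuracy, and resampling or class weighting are possible next steps.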
Last year, your business conducted segmentation on their customers using first- and third-party data. Some of the third-party variables are no longer available. What approach(es) could we take to be able to continue using the solution?
We have provided two data tables, containing Supporter data and their donations to a charity. Using a language of your choice (e.g. R, Python, SQL) and the .csv files provided, write the code to create an example dataset that could be used as a starting point to train a model to predict a supporter’s propensity of giving a 2nd donation.
You are not expected to complete extensive data exploration or cleaning.
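As a rough illustration of what such a starting dataset might look like, the sketch below builds one row per supporter who has donated at least once, with features taken from the first donation and a target flag for whether a second donation followed. The table and column names (`supporters`, `donations`, `supporter_id`, `join_date`, `donation_date`, `amount`) and the sample rows are hypothetical, since the real .csv files are not reproduced here; an in-memory SQLite database stands in for them.

```python
import sqlite3

# Hypothetical stand-ins for the two provided .csv files.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE supporters (supporter_id INTEGER PRIMARY KEY, join_date TEXT);
CREATE TABLE donations (supporter_id INTEGER, donation_date TEXT, amount REAL);
INSERT INTO supporters VALUES (1, '2022-01-05'), (2, '2022-02-10'), (3, '2022-03-15');
INSERT INTO donations VALUES
    (1, '2022-01-10', 20.0),
    (1, '2022-06-01', 35.0),
    (2, '2022-02-12', 10.0);
""")

# One row per supporter with at least one donation. Features come from the
# first donation only, so nothing from later gifts leaks into the inputs.
# Assumes at most one donation per supporter per date; ties would duplicate rows.
query = """
WITH first_gift AS (
    SELECT supporter_id,
           MIN(donation_date) AS first_donation_date,
           COUNT(*) AS n_donations
    FROM donations
    GROUP BY supporter_id
)
SELECT s.supporter_id,
       s.join_date,
       f.first_donation_date,
       d.amount AS first_amount,
       CASE WHEN f.n_donations >= 2 THEN 1 ELSE 0 END AS gave_second
FROM supporters s
JOIN first_gift f ON f.supporter_id = s.supporter_id
JOIN donations d ON d.supporter_id = f.supporter_id
                AND d.donation_date = f.first_donation_date
ORDER BY s.supporter_id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Supporter 3 has never donated and is therefore excluded, since a second donation is only meaningful for people who have already given once.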
For each piece of analysis work described below, specify the most appropriate statistical analysis or machine learning technique. Some items have more than one (equally) appropriate technique – please choose one to specify in your answer. Please include a brief description of any assumptions or limitations to your given answer.
a) Predicting how much (individual) customers will spend in the next financial year.
b) Modelling a time series of monthly sales data
c) Identifying the key dimensions within a large number of variables relating to customer attitudes.
d) Carrying out a segmentation to identify groups of customers with similar attitudes
You have been asked to generate a probability of a customer being “at risk of churn”. The client has provided data on 800,000 customers and their transactions.
Outline the steps required as part of an end-to-end plan to identify “at-risk” customers, including approaches, considerations, limitations/risks, and further recommendations/next steps where appropriate.