In recent years, machine learning, a specific area within the field of artificial intelligence, has successfully emulated various aspects of human intellect. Machine learning-based computer programs have demonstrated their prowess by outperforming world champions in games like chess and Go, emulating the styles of artists and writers, transcribing human speech, and facilitating autonomous vehicles. At its core, machine learning relies on mathematical principles that enable advanced pattern recognition. A critical aspect of machine learning is the challenge of generalization. For instance, if a machine learning program is trained on a dataset illustrating the appropriate actions in specific scenarios, it should also be capable of extrapolating from that training to determine the proper course of action in novel situations. This is done by showing the program many examples so that it can learn to generalise a concept from them.

Machine learning's current exposure in the press means that every startup and corporation is interested in incorporating it into their business. This hype is certainly problematic.

The goal of ML algorithms should be thought of as automating some tasks that humans are already quite capable of: that is, can they identify patterns that we can already see in the data, but do so in a more automatic and scalable manner?

Once we understand that, the question is: how can we pick real-world problems and try to solve them using this technology?
The most generic approach is to start from Tom Mitchell's widely quoted definition of machine learning:

A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.

Start with an informal problem description that states the general goals, e.g. "I need a program that will tell me which tweets will get retweets." Then convert it into a formal description. For example:

  • Task (T): Classify a tweet that has not been published as going to get retweets or not.
  • Experience (E): A corpus of tweets for an account where some have retweets and some do not.
  • Performance (P): Classification accuracy, the number of tweets predicted correctly out of all tweets considered as a percentage.
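
As a minimal sketch of the performance measure P above, classification accuracy can be computed by comparing predicted labels with observed outcomes. The function name `classification_accuracy` and the toy labels are illustrative assumptions, not part of any particular library or dataset.

```python
from typing import Sequence


def classification_accuracy(predicted: Sequence[bool], actual: Sequence[bool]) -> float:
    """Return the percentage of tweets whose retweet label was predicted correctly."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if len(actual) == 0:
        raise ValueError("at least one tweet is required")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return 100.0 * correct / len(actual)


# Hypothetical labels: True means the tweet was (or is predicted to be) retweeted.
predicted = [True, False, True, True, False]
actual = [True, False, False, True, False]
accuracy = classification_accuracy(predicted, actual)
print(f"{accuracy:.1f}%")  # 4 of 5 labels match -> 80.0%
```

If retweeted tweets are rare in the corpus, a model that always predicts "no retweets" can still score a high accuracy, so it is worth checking the class balance before settling on this measure.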

For a lot of companies, the problem can instead be framed as a hypothesis, e.g. "Algorithm/feature/design X will increase member engagement with our service and ultimately member retention."

The above approach may be too generic for our purposes. Here are some general heuristics for picking problems that are solvable by machine learning:

  1. You cannot code the rules: Many human tasks (such as recognising whether an email is spam or not spam) cannot be adequately solved using a simple (deterministic), rule-based solution. A large number of factors could influence the answer. When rules depend on too many factors and many of these rules overlap or need to be tuned very finely, it soon becomes difficult for a human to accurately code the rules. You can use ML to effectively solve this problem.
  2. You cannot scale: You might be able to manually review a few hundred emails and decide whether they are spam or not. However, this task becomes tedious for millions of emails. ML solutions are effective at handling large-scale problems.
  3. Personalization: The application requires that the software customize itself to its operational environment after it is fielded. One example is speech recognition systems that adapt to the user who purchases the software. Machine learning here provides the mechanism for adaptation. Software applications that customize to users are growing rapidly, e.g. bookstores that adapt to your purchasing preferences, or email readers that adapt to your particular definition of spam.
  4. Dataset availability: You have access to a sizeable set of data from which to train your model. There are no absolutes about how much data is enough, but every feature (data attribute) that you include in your model increases the number of instances (data records) you'll need to properly train it. Also account for splitting your dataset into three subsets: one for training, one for evaluation, and one for testing.
  5. Alternatives: You can't use an easier and more concrete mathematical model to solve your problem.
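
The three-way split mentioned under dataset availability can be sketched as follows. This is one simple way to do it under stated assumptions: the function name `three_way_split`, the 70/15/15 proportions, and the random shuffle are all illustrative choices, not a recommendation for every dataset.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def three_way_split(
    records: Sequence[T],
    train_fraction: float = 0.7,
    eval_fraction: float = 0.15,
    seed: int = 0,
) -> Tuple[List[T], List[T], List[T]]:
    """Shuffle records and split them into training, evaluation, and test subsets."""
    if not 0 < train_fraction < 1 or not 0 < eval_fraction < 1:
        raise ValueError("fractions must be strictly between 0 and 1")
    if train_fraction + eval_fraction >= 1:
        raise ValueError("fractions must leave room for a test subset")
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    train_end = round(len(shuffled) * train_fraction)
    eval_end = train_end + round(len(shuffled) * eval_fraction)
    # Whatever remains after the training and evaluation slices is the test subset.
    return shuffled[:train_end], shuffled[train_end:eval_end], shuffled[eval_end:]


# Hypothetical dataset of 100 record ids.
train_set, eval_set, test_set = three_way_split(range(100))
print(len(train_set), len(eval_set), len(test_set))  # 70 15 15
```

The test subset is kept aside until the very end, so that the final performance estimate is not influenced by choices made while tuning on the evaluation subset.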

If the problem passes the initial criteria described in the checklist above, the following tips may be useful for getting started:

1. State the assumptions: Create a list of assumptions about the problem and its phrasing. These may be rules of thumb and domain-specific information that you think will get you to a viable solution faster. It can be useful to highlight questions that can be tested against real data, because breakthroughs and innovation occur when assumptions and best practice are demonstrated to be wrong in the face of real data. It can also be useful to highlight areas of the problem specification that may need to be challenged, relaxed or tightened. For example:

  • The specific words used in the tweet matter to the model.
  • The specific user that retweets does not matter to the model.
  • The number of retweets may matter to the model.
  • Older tweets are less predictive than more recent tweets.

3. What type of data do the stakeholders have? Identify the ML problem category (for example, classification or regression). Is there enough data for models to generalise?

4. What accuracy would the stakeholders be happy with? How will we measure the success of the project and the model? What would the acceptance test look like?

5. Model Deployment (Serving): How will the model be deployed (in the cloud or on premises), and how will its results be presented? Will it become a feature of an existing product, or a stand-alone product, in which case we deploy the model as a predictive web service?

7. Post-Deployment Maintenance Strategy Discussion: It is useful to discuss how the team will deal with production challenges such as concept drift or data drift.
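
One possible way to monitor data drift after deployment is to compare the distribution of a feature in production against its distribution at training time. The sketch below uses the population stability index (PSI), a commonly used drift statistic, chosen here purely for illustration. The function name, the equal-width binning, the `1e-6` floor, and the sample data are assumptions. It covers data drift in a single numeric feature only; detecting concept drift would additionally require tracking model performance on freshly labelled data.

```python
import math
from typing import List, Sequence


def population_stability_index(
    reference: Sequence[float], current: Sequence[float], bins: int = 10
) -> float:
    """Compare two samples of one numeric feature; larger values mean a bigger shift."""
    if len(reference) == 0 or len(current) == 0:
        raise ValueError("both samples must be non-empty")
    lo, hi = min(reference), max(reference)
    # Fall back to a width of 1.0 when all reference values are identical.
    width = (hi - lo) / bins or 1.0

    def proportions(values: Sequence[float]) -> List[float]:
        counts = [0] * bins
        for v in values:
            # Values outside the reference range are clamped into the edge bins.
            index = min(max(int((v - lo) / width), 0), bins - 1)
            counts[index] += 1
        # A small floor (assumed value) avoids log(0) and division by zero.
        return [max(c / len(values), 1e-6) for c in counts]

    expected = proportions(reference)
    observed = proportions(current)
    return sum((o - e) * math.log(o / e) for o, e in zip(observed, expected))


# Hypothetical tweet lengths seen at training time and later in production.
training_lengths = [float(n) for n in range(20, 120)]
production_lengths = [float(n) for n in range(80, 180)]
no_shift = population_stability_index(training_lengths, training_lengths)
shift = population_stability_index(training_lengths, production_lengths)
print(no_shift)          # identical samples give 0.0
print(shift > no_shift)  # the shifted sample scores higher -> True
```

The value at which the team should investigate or retrain is a project decision and would need to be agreed on as part of the maintenance strategy.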

Covering these fundamentals will put you on the right track for executing ML projects successfully.