ML in the public psychiatric hospitals

Public health care could use Machine Learning if larger, open datasets were available

Published on: 07/06/2021

Social media companies and some other organisations use Machine Learning (ML) to pick up signs of mental illness. Some developers have worked on open source ML that could potentially be used in the public sector, but even the developers themselves point to bias in the algorithms due to the small datasets available. One organisation that uses ML to detect suicidal tendencies has open repositories, but only for its website.



Psychiatry in public hospitals faces a massive challenge: two years ago the World Health Organisation (WHO) estimated that mental illness affects 264 million people yearly. With a world population of about 7.7 billion, that is roughly 3.4% of the world’s population struggling with mental health issues.

To combat these illnesses, different kinds of automation and Machine Learning (ML) could be used as a supplement to regular treatment. Examples include voice recognition or biometric analysis to support human doctors in the hospital. However, teaching a machine to predict requires access to a large database.
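One simple way to see why dataset size matters is the standard error of an estimated proportion, which shrinks with the square root of the number of records. The sketch below is only an illustration: the rate of 0.2 and the sample sizes are assumed values, not figures from any hospital or study.

```python
import math


def standard_error(p, n):
    """Standard error of a proportion p estimated from n independent records."""
    return math.sqrt(p * (1 - p) / n)


ASSUMED_RATE = 0.2  # hypothetical share of records with the condition

for n in (100, 10_000, 1_000_000):
    # Each hundredfold increase in records cuts the uncertainty tenfold.
    print(f"n={n:>9}: +/- {standard_error(ASSUMED_RATE, n):.4f}")
```

With these assumed numbers the uncertainty falls from about 0.04 at 100 records to about 0.0004 at a million, which is one reason small datasets give shaky predictions.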


Open source data

Many mental health prediction notebooks on Kaggle, a data science platform owned by Google, are based on open datasets like Mental Health in the Tech Industry. These datasets are collected by the Open Sourcing Mental Illness (OSMI) organisation. OSMI works to change ‘how we talk about mental health in the tech community’. The dataset was built through surveys over six years.

OSMI’s surveys do not contain personal information, unlike the confidential health data from public (and private) hospitals. However, OSMI surveys only the tech community, so the data are biased towards that group. The same is therefore true of any predictions built on them.
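The effect of surveying only one community can be sketched with a small simulation. Every number here is invented for illustration (the share of tech workers and both rates are assumptions, not OSMI figures); the point is only that a rate measured in one subgroup can differ sharply from the rate in the whole population.

```python
import random

# All numbers below are invented for illustration; they are not OSMI figures.
TECH_SHARE = 0.05   # hypothetical share of tech workers in the population
RATE_TECH = 0.30    # hypothetical rate of the condition among tech workers
RATE_OTHER = 0.15   # hypothetical rate among everyone else


def simulate_population(size, rng):
    """Return a list of (is_tech, has_condition) pairs."""
    people = []
    for _ in range(size):
        is_tech = rng.random() < TECH_SHARE
        rate = RATE_TECH if is_tech else RATE_OTHER
        people.append((is_tech, rng.random() < rate))
    return people


def observed_rate(people):
    """Share of the given people who have the condition."""
    return sum(has_condition for _, has_condition in people) / len(people)


rng = random.Random(0)  # fixed seed so the run is repeatable
population = simulate_population(200_000, rng)
tech_only = [person for person in population if person[0]]

true_rate = observed_rate(population)
survey_rate = observed_rate(tech_only)
print(f"whole population: {true_rate:.3f}")
print(f"tech-only survey: {survey_rate:.3f}")
```

Under these assumptions a model trained on the tech-only sample would expect the condition to be about twice as common as it is in the full population, however carefully the model itself is built.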


Not so open algorithms

A lack of data is not a problem for Facebook. In 2018 the tech giant made an algorithm to predict and react to suicide risk. Facebook has its own dataset from its users' activities, and the initiative was met with criticism over a lack of ethics and claims of bias.

Similarly to Facebook, Crisis Text Line’s ML is based on a large amount of data: its own. Crisis Text Line’s algorithm is set to identify which texters are most urgent to talk to and at greatest risk of a suicide attempt. Crisis Text Line has 23 repositories on its GitHub; however, all of them concern its website and its design.
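Since the actual algorithm is not public, the following is only a generic sketch of what risk-based prioritisation can look like, not Crisis Text Line’s method. The class `TriageQueue`, the texter identifiers, and the risk scores are all hypothetical; in a real system the scores would come from a trained model.

```python
import heapq
import itertools


class TriageQueue:
    """Serve waiting conversations in order of highest risk score first."""

    def __init__(self):
        self._heap = []
        self._arrival = itertools.count()  # tie-breaker: earlier arrival first

    def add(self, conversation_id, risk_score):
        # heapq is a min-heap, so the score is negated to pop the highest first.
        heapq.heappush(
            self._heap, (-risk_score, next(self._arrival), conversation_id)
        )

    def next_conversation(self):
        _, _, conversation_id = heapq.heappop(self._heap)
        return conversation_id

    def __len__(self):
        return len(self._heap)


queue = TriageQueue()
queue.add("texter-a", 0.20)  # hypothetical model scores between 0 and 1
queue.add("texter-b", 0.90)
queue.add("texter-c", 0.55)

order = [queue.next_conversation() for _ in range(len(queue))]
print(order)  # ['texter-b', 'texter-c', 'texter-a']
```

The queue itself is simple; the hard and ethically loaded part is the model that produces the scores, which is exactly the part that stays closed.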


The many layers of ethics

There are three layers of ethics in this case. One: should we let doctors be assisted by ML at all? Two: if doctors may be assisted by ML, where do we get the data the predictions are based on? And lastly: if we already have access to the data, should the algorithms be open source so that everyone can correct them for bias?



Final take-aways

  • Machine Learning (ML) requires large datasets to estimate correctly and with as little bias as possible.
  • Social media companies and a suicide crisis line use their own data to create ML, but the repositories are closed.
  • The need for open source ML lies with hospitals that require voice, text, or biometric analyses.