UCL-CORU · zmek · Mar 24, 2025 · Jan 30, 2025 · Feb 4, 2025 · Feb 4, 2025
@@ -1,4 +1,4 @@
-# PatientFlow: Code and explanatory notebooks for predicting short-term hospital bed capacity using real-time data
+# PatientFlow: Predicting demand for hospital beds using real-time data
 
 [![pre-commit](https://img.shields.io/badge/pre--commit-enabled-brightgreen?logo=pre-commit&logoColor=white)](https://github.com/pre-commit/pre-commit)
 [![Tests status][tests-badge]][tests-link]
@@ -29,28 +29,26 @@
 [pypi-version]:             https://img.shields.io/pypi/v/patientflow -->
 <!-- prettier-ignore-end -->
 
-Welcome to the PatientFlow repository, which is designed to support hospital bed management through predictive modelling. I'm [Zella King](https://github.com/zmek/), a health data scientist in the Clinical Operational Research Unit (CORU) at University College London. Since 2020, I have worked with University College London Hospital (UCLH) NHS Trust on practical tools to improve patient flow through the hospital.
+Welcome to the PatientFlow repository, which provides predictive modelling for hospital bed management. I'm [Zella King](https://github.com/zmek/), a health data scientist in the Clinical Operational Research Unit (CORU) at University College London. Since 2020, I have worked with University College London Hospital (UCLH) on practical tools to improve patient flow through the hospital.
 
-My most important contribution is a software application that [Jon Gillham](https://github.com/jongillham) and I developed, which is now in daily use by bed managers at the hospital. That application generates predictions of emergency demand for beds, using real-time data from the hospital's patient record system, and sends the predictions to the bed managers. I created the predictive models that are used in the application. Jon created the software that runs my modelling code five times a day, and sends the predictions by email to the bed managers.
+With a team from UCLH, I developed a predictive tool that is now in daily use by bed managers at the hospital. The tool generates predictions of emergency demand for beds, using real-time data from the hospital's patient record system.
 
-I developed the code I wrote for UCLH into a reusable resource following the principles of [Reproducible Analytical Pipelines](https://analysisfunction.civilservice.gov.uk/support/reproducible-analytical-pipelines/). I did this because I want to:
+I am sharing the code I wrote for UCLH as a reusable resource because I want to make it easier for researchers to convert patient-level predictions into output that is useful for hospital bed managers. This repository includes a Python package, called `patientflow`, that performs this conversion. If you have a predictive model of some outcome for a patient, like admission or discharge from hospital, you can use `patientflow` to create bed count distributions for a cohort of patients.
 
-1. Share the code with researchers and NHS analysts who are work on similar models
-2. Make it easier for others to make use of the mathemetics involved in making these predictions
-3. Inform and educate anyone who wishes to adopt a similar approach
+The methods generalise to any problem where it is useful to convert patient-level predictions into outcomes for a whole cohort of patients at a point in time. The repository includes a synthetic dataset and a series of notebooks demonstrating the use of the package.
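
As an illustration of what converting patient-level predictions into a cohort-level output can mean, here is a minimal sketch. It is not the `patientflow` API: the function name `bed_count_distribution` and the example probabilities are hypothetical, and the sketch assumes that each patient's outcome is independent of the others. Under that assumption, the number of beds needed is a sum of independent yes/no outcomes, and its distribution can be built up by convolving in one patient at a time.

```python
import numpy as np


def bed_count_distribution(probs):
    """Return an array where entry k is P(exactly k patients need a bed).

    Hypothetical helper for illustration only (not part of patientflow).
    Assumes patient outcomes are independent of each other.
    """
    # With no patients in the cohort, zero beds are needed with certainty.
    dist = np.array([1.0])
    for p in probs:
        # Each patient adds zero beds with probability 1 - p, or one bed with probability p.
        dist = np.convolve(dist, [1.0 - p, p])
    return dist


# Hypothetical patient-level admission probabilities from some predictive model.
admission_probs = [0.9, 0.5, 0.2]
dist = bed_count_distribution(admission_probs)

for beds, prob in enumerate(dist):
    print(f"P({beds} beds) = {prob:.2f}")
```

For these three example patients the most likely outcome is two beds, with probability 0.46. A bed manager would typically care less about a single most likely number and more about statements such as "at least one bed will be needed with probability 0.96", which can be read off the same distribution.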
 
 ## Main features of my modelling approach
 
 - **Led by what users need:** My work is the result of close collaboration with operations directors and bed managers in the Coordination Centre, University College London Hospital (UCLH), since 2020. What is modelled directly reflects how they work and what is most useful to them.
-- **Focused on short-term predictions:** I am expert in predicting demand within a short time horizon eg 8 or 12 hours. Here I show models that predict how many beds will be needed emergency patients. (Later I plan to add modules that also predict elective demand, discharge and transfers between specialties.)
-- **Assumes real-time data is available:** Hospital bed managers have to deal with rapidly changing situations. My focus is on the use of real-time data (or near to real-time) to help them make informed decisions. The modelling shown here assumes that a hospital has some capacity to make use of real-time data in its electronic health record, even if this data is minimal.
+- **Focused on short-term predictions:** The modelling is designed for predicting demand within a short time horizon, e.g. 8 or 12 hours. I show how to use my code to predict how many beds will be needed for emergency patients. (Later I plan to add modules for elective demand, discharge and transfers between specialties.)
+- **Assumes real-time data is available:** Hospital bed managers have to deal with rapidly changing situations. My focus is on the use of real-time data (or near to real-time) to help them make informed decisions.
 
 ## Main Features of this repository
 
 - **Reproducible** - I follow the principles of [Reproducible Analytical Pipelines](https://analysisfunction.civilservice.gov.uk/support/reproducible-analytical-pipelines/). The repository can be installed as a Python package, and imported into your own code.
 - **Accessible** - All the elements are based on simple techniques and methods in Health Data Science and Operational Research. I intend that anyone with some knowledge of Python could understand and adapt the code for their use.
-- **Practical:** - I believe that it is easier to follow the steps I took if you have access to the same data I have. UCLH have released an anomymised version of real patient data, which you can request access to on [Zenodo](https://zenodo.org/records/14866057), or you can use the synthetic dataset, derived from real patient data, in the `data-synthetic` folder. (Note that, if you use the synthetic dataset, the integrity of relationships between variables is not maintained and you will obtain articifically inflated model performance.)
-- **Interactive:** The repository includes a set of notebooks with code written on Python, with commentary. If you clone the repo into your own workspace and have an environment for running Jupyter notebooks, you will be able to interact with the code and see it running.
+- **Practical** - I believe that it is easier to follow the steps I took if you have access to the same data I have. UCLH have released an anonymised version of real patient data, which you can request access to on [Zenodo](https://zenodo.org/records/14866057), or you can use the synthetic dataset, derived from real patient data, in the `data-synthetic` folder. (Note that, if you use the synthetic dataset, you will observe artificially inflated model performance.)
+- **Interactive** - The repository includes a set of notebooks with code written in Python, accompanied by commentary. If you clone the repo into your own workspace and have an environment for running Jupyter notebooks, you will be able to interact with the code and see it running.
 
 ## Getting started
 
@@ -85,7 +83,7 @@ If you get errors running the pytest command, there may be other installations n
 
 The data provided (which is synthetic) can be used to demonstrate training the models. To run training you have two options
 
-- step through the notebooks (for this to work you'll either need copy the two csv files from `data-synthetic`into your `data-public` folder or contact us for real patient data)
+- step through the notebooks (for this to work you'll need either to copy the two csv files from `data-synthetic` into your `data-public` folder or to request access to real patient data on [Zenodo](https://zenodo.org/records/14866057))
 - run a Python script using the following commands (by default this will run with the synthetic data in its current location; you can change the `data_folder_name` parameter if you have the real data in `data-public`)
 
 ```sh
@@ -104,13 +102,12 @@ The data_folder_name specifies the name of the folder containing data. The funct
 
 ## About
 
-This project was inspired by the [py-pi template](https://github.com/health-data-science-OR/pypi-template) developed by Tom Monks, and is developed in collaboration with the
-[Centre for Advanced Research Computing](https://ucl.ac.uk/arc), University
-College London.
+This project was inspired by the [py-pi template](https://github.com/health-data-science-OR/pypi-template) developed by Tom Monks, and is based on a template developed by the
+[Centre for Advanced Research Computing](https://ucl.ac.uk/arc), University College London.
 
 ### Project Team
 
-Dr Zella King, Clinical Operational Research Unit (CORU), UCL ([zella.king@ucl.ac.uk](mailto:zella.king@ucl.ac.uk))
+Dr Zella King, Clinical Operational Research Unit (CORU), University College London ([zella.king@ucl.ac.uk](mailto:zella.king@ucl.ac.uk))
 Jon Gillham, Institute of Health Informatics, UCL
 Professor Sonya Crowe, CORU
 Professor Martin Utley, CORU