Last year, for the first time, the ORR published the Origin-Destination Matrix - a dataset consisting of an estimate, from ticketing data, of the number of journeys between every pair of stations. They've just now released the 2022-23 data. As the process of making an account to download it is something of a pain, I've re-uploaded it here:
drive.google.com
Note that this file is about 140MB uncompressed. I've sorted it by number of journeys, highest first, to limit the damage from Excel dropping everything past its 1,048,576-row limit - it'll still happen, but the journeys that get lost will be the rarest ones. Also, I'm legally required to tell you that it Contains public sector information licensed under the Open Government Licence v3.0.
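If you'd rather sidestep the Excel row limit entirely, the file can be read a row at a time instead. This is only a rough sketch using Python's standard library: the unzipped filename in the usage comment and the presence of a single header row are assumptions, not something confirmed by the file itself. Since the file is already sorted by journeys, the first few rows are the busiest flows.

```python
import csv
from itertools import islice


def head_rows(path, n=10):
    """Return (header, first n data rows) without loading the whole file."""
    # utf-8-sig copes with an optional byte-order mark; the real encoding is an assumption.
    with open(path, newline="", encoding="utf-8-sig") as f:
        reader = csv.reader(f)
        header = next(reader)  # assumes the first line is a header row
        return header, list(islice(reader, n))


def count_rows(path):
    """Count data rows (excluding the assumed header) by streaming through the file."""
    with open(path, newline="", encoding="utf-8-sig") as f:
        reader = csv.reader(f)
        next(reader, None)  # skip the header if there is one
        return sum(1 for _ in reader)


# Example usage (hypothetical filename after unzipping):
# header, rows = head_rows("ODM-22-23-sorted.csv", 10)
# print(header)
# for row in rows:
#     print(row)
# print(count_rows("ODM-22-23-sorted.csv"), "rows in total")
```

Comparing the result of `count_rows` with what Excel shows is a quick way to see how many rows got cut off.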
The top journeys were:
1. London Liverpool Street - Tottenham Court Road (2896936)
2. Stratford (London) - London Liverpool Street (2715657)
3. London Liverpool Street - Stansted Airport (2529592)
4. Barking - West Ham (2448000)
5. London Victoria - Gatwick Airport (2292005)
6. Tottenham Court Road - London Paddington (2197641)
7. Reading - London Paddington (2045987)
8. Tottenham Court Road - Bond Street (1964482)
9. London Paddington - Bond Street (1860320)
10. Stratford (London) - Romford (1816911)
This data is estimated from ticketing numbers, so in some cases it'll be wrong. This is especially apparent for stations ticketed as groups (Glasgow Central/Queen Street, London Terminals, etc), where they use various rules to estimate which station was likely used in practice. They've improved the methodology for London Terminals this year (so there's no longer the strange situation where most Sheffield passengers went to Kings Cross), but other anomalies are likely to remain. If allocation between group stations looks wrong, it probably is. Similarly, things like PTE day/concessionary tickets and incomplete contactless journeys will affect the accuracy of the data to some degree, with the latter manifesting as journeys from a station back to the same station.
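To get a feel for how much the incomplete contactless journeys matter, you can pull out the rows where the origin and destination are the same station. Again, this is a sketch under assumptions: the column names are passed in as parameters because I'm not asserting what the real headers are, and the names in the usage comment are purely hypothetical.

```python
import csv


def self_journeys(rows, origin_col, dest_col, journeys_col):
    """Return (matching rows, total journeys) for rows whose origin equals their destination."""
    matches = [r for r in rows if r[origin_col] == r[dest_col]]
    # float() is used in case the estimates are not whole numbers.
    total = sum(float(r[journeys_col]) for r in matches)
    return matches, total


# Example usage (hypothetical filename and column names, check the real header first):
# with open("ODM-22-23-sorted.csv", newline="", encoding="utf-8-sig") as f:
#     found, total = self_journeys(csv.DictReader(f), "origin", "destination", "journeys")
# print(len(found), "same-station rows,", total, "journeys")
```

The same pattern, with a different condition in the list comprehension, would work for picking out a group of stations to sanity-check how journeys have been allocated between them.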
ODM-22-23-sorted.csv.zip
