FARE ACCESSIBILITY DATA: README

This is an annex to my dissertation about public transport accessibility for low-income earners.
For an English account of the context, have a look at this paper in the Journal of Transport Geography:
DOI: 10.1016/j.jtrangeo.2025.104348
For a German account of the context, see my thesis fulltext (especially the method section 3):
DOI: 10.15480/882.13161
An English summary of my thesis can be found in the Fare Accessibility Dashboard:
DOI: 10.15480/882.13164

This README is a simple guide to the basic terms and the structure of the three data sets that I built for my spatial regressions. Unfortunately, I can’t provide all the variables because of the terms and conditions of some of the data sets. In any case, these data will hopefully be informative to anyone studying public transport accessibility and affordability.

You are welcome to re-use, adapt and share the data under the Creative Commons Attribution-ShareAlike 4.0 International license (CC BY-SA 4.0).

Christoph Aberle
Hamburg University of Technology
Institute for Transport Planning and Logistics
ORCID: 0000-0003-0982-4869
christoph.aberle@tuhh.de
The TUHH email will be offline soon. If you use the dataset, or have a question, please drop me a line at christoph [at] fluegelrad [dot] net. I’m always curious to see what others get out of my data!

Basic terms

| Term | Explanation | Link |
|---|---|---|
| PT Authority | “Verkehrsverbund” that organises public transport, mainly in terms of planning and ticketing; in this data set either HVV or VBB | Wikipedia |
| HVV | PT Authority for Hamburg and surroundings | Wikipedia |
| VBB | PT Authority for Berlin and Brandenburg | Wikipedia |
| Municipality | Lowest level of official territorial division (“Gemeinde”) | Wikipedia |
| Statistisches Gebiet (Hamburg) | One of 941 areas that are used by the official social monitoring (“RISE / Rahmenprogramm Integrierte Stadtteilentwicklung”) | Paper that provides a summary |
| Planungsraum (Berlin) | One of 447 areas that are used by urban planners and others | Paper that works with these areas (e.g. Fig. 4 & 7) |
| AGS / Amtlicher Gemeindeschlüssel | Official Municipality Key | Wikipedia |
| EPSG | Spatial Reference System Identifier for my geodata | Wikipedia |

Dataset Description

I provide two geopackage files, one for the HVV and one for the VBB service area.

Each of these files contains three layers that represent different levels: stop, municipality and 500m grid.

All data reflect the state as of 12/2018 or the 2018/19 timetable.
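
A GeoPackage is an SQLite database, so the layers and their SRIDs can be listed without any GIS software by querying the standard gpkg_contents table. The following is only a minimal sketch: the layer names and the SRID in the demo are placeholders that I made up for illustration, and the file name in the last comment is hypothetical. Please check the actual files for the real values.

```python
import sqlite3


def list_layers(conn):
    """Return (layer name, data type, SRID) for every entry in gpkg_contents."""
    rows = conn.execute(
        "SELECT table_name, data_type, srs_id "
        "FROM gpkg_contents ORDER BY table_name"
    )
    return [tuple(row) for row in rows]


# Stand-in for a real file: an in-memory database with a minimal
# gpkg_contents table. Layer names and SRID are placeholders only.
demo = sqlite3.connect(":memory:")
demo.execute(
    "CREATE TABLE gpkg_contents (table_name TEXT, data_type TEXT, srs_id INTEGER)"
)
demo.executemany(
    "INSERT INTO gpkg_contents VALUES (?, ?, ?)",
    [
        ("stop", "features", 25832),
        ("municipality", "features", 25832),
        ("grid", "features", 25832),
    ],
)

layers = list_layers(demo)
for name, data_type, srid in layers:
    print(name, data_type, srid)

# With a real file, open it instead (hypothetical file name):
# conn = sqlite3.connect("hvv.gpkg")
```

Comparing the srs_id values of both files before combining them is worthwhile, because HVV and VBB use different SRIDs.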

Stop level

This level contains data for public transport stops in the HVV and VBB service areas.
For aggregates and violin plots of the input variables, see the model reports (HVV_Stop.html and VBB_Stop.html).

| Column | Data Type | Explanation | Link / Source |
|---|---|---|---|
| id | INT | stop id, primary key | |
| name | VARCHAR | stop name | |
| ur_int | INT | case area: 1 = Hamburg / 2 = Berlin / 3 = HVV outside Hamburg / 4 = VBB outside Berlin | |
| ags | VARCHAR | Official Municipality Key | Wikipedia |
| bestmode | INT | the ‘best’ means of transport that departs here, as per capacity: 1 = local and regional railways (SPNV, U-Bahn) / 2 = buses and trams / 3 = ferries | |
| cells | BIGINT | count of populated raster cells within 800m radius | |
| outlier | INT | 1 = stop is an outlier of the whole data set (beyond ± 1.5 IQR) | Wikipedia, section 3.3 of my thesis (in German) |
| outlier_u | INT | as above, for the urban subset | as above |
| outlier_r | INT | as above, for the rural subset | as above |
| rs17 | INT | spatial type of the surrounding municipality according to the federal RegioStaR typology | RegioStaR handbook |
| rs_class | INT | spatial class: 1 = urban / 0 = rural | based on the RegioStaR typology, within Berlin and Hamburg based on local typologies, see section 3.1 of my thesis (in German) |
| rs_hh | VARCHAR | spatial type of the Statistisches Gebiet: c = city (central business district) / i = inner town / z = in-between zone (“Zwischenzone”) / s = fringe (“Stadtrand”) / g = industrial (“Gewerbe”) / l = rural (“Ländliches Hamburg”; not to be confused with the rural value of rs_class!). This attribute is only available for Hamburg. For Berlin, the urban/rural information is simply coded within the rs_class attribute. | based on Gesa Matthes’ 2010 typology, updated to the state of 12/2018, see Annex A3 of my thesis (in German) |
| km_centre | NUMERIC | distance to the nearest centre in km | calculation based on official documents, see section 3.2 of my thesis (in German) |
| ptx | NUMERIC | public transport service index (no unit), aggregated from grid level (median of populated cells within 800m radius) | see Aberle et al. 2025 section 3.2.2, inspired by Delbosc & Currie (2011) |
| ptx_cap | NUMERIC | as above, but the median of per-capita ptx of all grid cells within 800m radius (i.e. for each cell, ptx was divided by the number of residents) | |
| t1 | NUMERIC | fare accessibility on a €1.70 budget (where applicable), log-transformed (natural logarithm), normalised and weighted | see Aberle & Gertz 2025 |
| t2 | NUMERIC | fare accessibility on a €2.30 budget, log-transformed (natural logarithm), normalised and weighted | as above |
| t2sum | NUMERIC | fare accessibility on a €2.30 budget, absolute and weighted for Lorenz curves (I summed up all weighted destinations, without log transformation, across 15 categories, e.g. 0.19 · grocery store count + 0.15 · doctors count …) | the weights can be found in Aberle & Gertz 2025, table 3 |
| ttime | INT | travel time to the nearest destination (minutes; weighted average across 15 categories) | see section 3.2 of my thesis (in German) |
| Standardised variables | | | |
| km_centre_zt | NUMERIC | km_centre, normalised to mean = 0 and SD = 1 | Wikipedia |
| ptx_zt | NUMERIC | ptx, normalised to mean = 0 and SD = 1 | Wikipedia |
| ptx_cap_zt | NUMERIC | ptx_cap, normalised to mean = 0 and SD = 1 | Wikipedia |
| t1_zt | NUMERIC | t1, normalised to mean = 0 and SD = 1 | Wikipedia |
| t2_zt | NUMERIC | t2, normalised to mean = 0 and SD = 1 | Wikipedia |
| ttime_zt | NUMERIC | ttime, normalised to mean = 0 and SD = 1 | Wikipedia |
| geom | GEOMETRY (POINT) | note that HVV and VBB have different SRIDs | |
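
To illustrate what “beyond ± 1.5 IQR” means for the outlier columns, here is a minimal sketch of the common quartile-fence rule (values below Q1 − 1.5 · IQR or above Q3 + 1.5 · IQR are flagged). It is an assumption on my part that this matches the published flags exactly: the quartile method used here is the Python default, the sample values are invented, and the variable the rule was applied to is documented in section 3.3 of the thesis.

```python
from statistics import quantiles


def iqr_outlier_flags(values, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR] with 1, all others with 0."""
    q1, _, q3 = quantiles(values, n=4)  # default 'exclusive' quartile method
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [1 if (v < lower or v > upper) else 0 for v in values]


# Invented sample values, not taken from the data set
sample = [2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0, 30.0]
flags = iqr_outlier_flags(sample)
print(flags)
```

For outlier_u and outlier_r, the same rule would be applied to the urban and the rural subset separately, so the quartiles differ between the three columns.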

Municipality level

This level contains data for municipalities in the HVV and VBB service areas.
For the cities of Hamburg and Berlin, I’ve complemented the dataset with geometries for the statistical areas (Hamburg: Statistische Gebiete / Berlin: Planungsräume; see Basic terms above).
For aggregates and violin plots of the input variables, see the model reports (HVV_Municipality.html and VBB_Municipality.html).

| Column | Data Type | Explanation | Link / Source |
|---|---|---|---|
| agsx | VARCHAR | Municipality Key (“Amtlicher Gemeindeschlüssel”), primary key. Within Hamburg and Berlin: followed by a ‘-’ and the id of the statistical area | Wikipedia |
| name | VARCHAR | name | |
| ur_int | INT | case area: 1 = Hamburg / 2 = Berlin / 3 = HVV outside Hamburg / 4 = VBB outside Berlin | cells |
| outlier | INT | 1 = municipality or statistical area is an outlier of the whole data set (beyond ± 1.5 IQR) | Wikipedia, section 3.3 of my thesis (in German) |
| outlier_u | INT | as above, for the urban subset | as above |
| outlier_r | INT | as above, for the rural subset | as above |
| rs17 | INT | spatial type of the municipality according to the federal RegioStaR typology | RegioStaR handbook |
| rs_class | INT | spatial class: 1 = urban / 0 = rural | based on the RegioStaR typology, within Berlin and Hamburg based on local typologies, see section 3.1 of my thesis (in German) |
| rs_hh | VARCHAR | spatial type of the Statistisches Gebiet: c = city (central business district) / i = inner town / z = in-between zone (“Zwischenzone”) / s = fringe (“Stadtrand”) / g = industrial (“Gewerbe”) / l = rural (“Ländliches Hamburg”; not to be confused with the rural value of rs_class!). This attribute is only available for Hamburg. For Berlin, the urban/rural information is simply coded within the rs_class attribute. | based on Gesa Matthes’ 2010 typology, updated to the state of 12/2018, see Annex A3 of my thesis (in German) |
| km_centre | NUMERIC | distance to the nearest centre in km (median of populated 100m grid cells within the municipality) | |
| ptx | NUMERIC | public transport service index (no unit; median of populated 100m grid cells within the municipality) | see Aberle et al. 2025 section 3.2.2, inspired by Delbosc & Currie (2011) |
| ptx_cap | NUMERIC | as above, but the median of per-capita ptx of all populated 100m grid cells within the municipality (i.e. for each cell, ptx was divided by the number of residents) | |
| t1 | NUMERIC | fare accessibility on a €1.70 budget (where applicable; median of populated 100m grid cells within the municipality) | see Aberle & Gertz 2025 |
| t2 | NUMERIC | fare accessibility on a €2.30 budget (median of populated 100m grid cells within the municipality) | as above |
| t2sum | NUMERIC | fare accessibility on a €2.30 budget, absolute and weighted for Lorenz curves (I summed up all weighted destinations, without log transformation, across 15 categories, e.g. 0.19 · grocery store count + 0.15 · doctors count …) | the weights can be found in Aberle & Gertz 2025, table 3 |
| ttime | INT | travel time to the nearest destination (minutes; weighted average across 15 categories; median of populated 100m grid cells within the municipality) | see section 3.2 of my thesis (in German) |
| Standardised variables | | | |
| km_centre_zt | NUMERIC | km_centre, normalised to mean = 0 and SD = 1 | Wikipedia |
| ptx_zt | NUMERIC | ptx, normalised to mean = 0 and SD = 1 | Wikipedia |
| ptx_cap_zt | NUMERIC | ptx_cap, normalised to mean = 0 and SD = 1 | Wikipedia |
| t1_zt | NUMERIC | t1, normalised to mean = 0 and SD = 1 | Wikipedia |
| t2_zt | NUMERIC | t2, normalised to mean = 0 and SD = 1 | Wikipedia |
| ttime_zt | NUMERIC | ttime, normalised to mean = 0 and SD = 1 | Wikipedia |
| geom | GEOMETRY (MULTIPOLYGON) | note that HVV and VBB have different SRIDs | |
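
Several municipality-level columns are medians of populated 100m grid cells. The sketch below shows how I would read that aggregation for ptx and ptx_cap: the per-capita value is computed per cell first, and the median is taken afterwards. The cell values, the dictionary keys and the function name are invented for illustration, and the 100m cell data themselves are not part of the published files.

```python
from statistics import median


def aggregate_cells(cells):
    """Aggregate the populated 100m cells of one municipality using medians."""
    populated = [c for c in cells if c["residents"] > 0]
    return {
        # median of the service index across populated cells
        "ptx": median(c["ptx"] for c in populated),
        # per-cell ratio first, then the median of those ratios
        "ptx_cap": median(c["ptx"] / c["residents"] for c in populated),
    }


# Invented 100m cells of one hypothetical municipality
cells_100m = [
    {"ptx": 12.0, "residents": 100},
    {"ptx": 8.0, "residents": 10},
    {"ptx": 20.0, "residents": 40},
    {"ptx": 5.0, "residents": 0},  # unpopulated, therefore ignored
]
result = aggregate_cells(cells_100m)
print(result)
```

Note that the median of the per-cell ratios generally differs from the ratio of the two medians, which is why ptx_cap cannot be recomputed from ptx alone.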

500m grid level

This level contains data for populated grid cells in the HVV and VBB service areas.
For aggregates and violin plots of the input variables, see the model reports (HVV_Grid.html and VBB_Grid.html).

| Column | Data Type | Explanation | Link / Source |
|---|---|---|---|
| gitter_id_500m | VARCHAR | INSPIRE 500m grid id, primary key | EU INSPIRE directive |
| ur_int | INT | case area: 1 = Hamburg / 2 = Berlin / 3 = HVV outside Hamburg / 4 = VBB outside Berlin | |
| outlier | INT | 1 = grid cell is an outlier of the whole data set (beyond ± 1.5 IQR) | Wikipedia, section 3.3 of my thesis (in German) |
| outlier_u | INT | as above, for the urban subset | as above |
| outlier_r | INT | as above, for the rural subset | as above |
| rs17 | INT | spatial type of the municipality according to the federal RegioStaR typology (mode of populated 100m grid cells within the 500m grid cell, i.e. the value that appeared most often) | RegioStaR handbook |
| rs_class | INT | spatial class: 1 = urban / 0 = rural | based on the RegioStaR typology, within Berlin and Hamburg based on local typologies, see section 3.1 of my thesis (in German) |
| rs_hh | VARCHAR | spatial type of the Statistisches Gebiet: c = city (central business district) / i = inner town / z = in-between zone (“Zwischenzone”) / s = fringe (“Stadtrand”) / g = industrial (“Gewerbe”) / l = rural (“Ländliches Hamburg”; not to be confused with the rural value of rs_class!). This attribute is only available for Hamburg in the HVV data set. For Berlin, the urban/rural information is simply coded within the rs_class attribute. | based on Gesa Matthes’ 2010 typology, updated to the state of 12/2018, for details see table in Annex A3 of my thesis (in German) |
| km_centre | NUMERIC | distance to the nearest centre in km (median of populated 100m grid cells within the 500m grid cell) | |
| ptx | NUMERIC | public transport service index (no unit; median of populated 100m grid cells within the 500m grid cell) | see Aberle et al. 2025 section 3.2.2, inspired by Delbosc & Currie (2011) |
| ptx_cap | NUMERIC | as above, but the median of per-capita ptx of all populated 100m grid cells within the 500m grid cell (i.e. for each cell, ptx was divided by the number of residents) | |
| t1 | NUMERIC | fare accessibility on a €1.70 budget (where applicable; median of populated 100m grid cells within the 500m grid cell) | see Aberle & Gertz 2025 |
| t2 | NUMERIC | fare accessibility on a €2.30 budget (median of populated 100m grid cells within the 500m grid cell) | as above |
| t2sum | NUMERIC | fare accessibility on a €2.30 budget, absolute and weighted for Lorenz curves (I summed up all weighted destinations, without log transformation, across 15 categories, e.g. 0.19 · grocery store count + 0.15 · doctors count …) | the weights can be found in Aberle & Gertz 2025, table 3 |
| ttime | INT | travel time to the nearest destination (minutes; weighted average across 15 categories; median of populated 100m grid cells within the 500m grid cell) | see section 3.2 of my thesis (in German) |
| Standardised variables | | | |
| km_centre_zt | NUMERIC | km_centre, normalised to mean = 0 and SD = 1 | Wikipedia |
| ptx_zt | NUMERIC | ptx, normalised to mean = 0 and SD = 1 | Wikipedia |
| ptx_cap_zt | NUMERIC | ptx_cap, normalised to mean = 0 and SD = 1 | Wikipedia |
| t1_zt | NUMERIC | t1, normalised to mean = 0 and SD = 1 | Wikipedia |
| t2_zt | NUMERIC | t2, normalised to mean = 0 and SD = 1 | Wikipedia |
| ttime_zt | NUMERIC | ttime, normalised to mean = 0 and SD = 1 | Wikipedia |
| geom | GEOMETRY (POLYGON) | note that HVV and VBB have different SRIDs | |
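
The _zt columns are standardised to mean 0 and SD 1. A minimal sketch of that transformation follows. It assumes the sample standard deviation and a standardisation across one whole list of values; whether the published columns used the sample or the population SD, and which subset they were standardised over, is not stated in this README. The input values are invented.

```python
from statistics import mean, stdev


def z_standardise(values):
    """Return z-scores: (value - mean) / standard deviation."""
    m = mean(values)
    s = stdev(values)  # sample SD; an assumption, see the note above
    return [(v - m) / s for v in values]


# Invented km_centre values for five hypothetical grid cells
km_centre = [1.2, 3.4, 0.8, 7.5, 2.1]
km_centre_zt = z_standardise(km_centre)
print([round(z, 3) for z in km_centre_zt])
```

After this step the variables share a common scale, so regression coefficients for, say, km_centre_zt and ptx_zt can be compared directly.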

If you use the data, or have a question, please drop me a line
at christoph [at] fluegelrad [dot] net.

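Have fun with the data, enjoy your ride. Geld allein macht auch nicht glücklich. Aber irgendwie schon besser, im Taxi zu weinen als im HVV-Bus. (Money alone doesn’t make you happy either. But somehow it’s still better to cry in a taxi than on an HVV bus.)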

                                                                                     
                                                                 % %               
                                                              #     *              
                                                           /.      &                
                                                    # & /,       * 
                                                    ,       *           
                                                    *        &
                                          _. **  &        &             
                                       (                 %&
                                       #                  &%
                                       &
                          .%                                   &
                                 *#&&&                                  
                      &                                         %
                     &                                          (
   %                &                                        %                             
 %  &             ,                                        /                               
 ,   #(         (.                                          #                             
      &                                                   #                              
 &     &  */                                              &                              
     & #,                                           .& .#&#%#
&                                                     &                                      
.                                                   *                                      
                                                    .(                                      
&                                                       / .         
     &                                                   #                                  
   . .                                                  ( .(/ %                          
   %           ________________                               &                          
    &         |                |                               /                          
      &       | Made with love |                               .                       
         ,    |    at  TUHH    |                                 &                     
         &    |________________|                                   &                     
         %                     |                                      &                    
         &                     |                                       (                   
          ,&                   O        * #                              &                
              %   &              (  #                                   / &.&        
                & # %        & .%&   &                                   # ./         
                   #          &      (                                       /         
                  &%    .      *     %                                   &            
                      & (*. (&         &                             .&               
                     % %   /.(           %                       (                   
                                                  #              .                    
                                                    /           &                     
                                                      #&  .&%,.#*