Dubai Land Department (DLD) - Dataset EDA

Source: Dubai Pulse Open Data - DLD Transactions and Rent Contracts
Files: Transactions.csv, Rent_Contracts.csv
Purpose: Familiarisation with the raw dataset - structure, coverage, distributions, and data quality notes before any modelling or analysis.


Setup

1. Transactions Dataset

The transactions file records every registered property transaction in Dubai: sales, mortgages, and gifts. Each row is one transaction. We load the full file and inspect it before applying any filters.
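The load-and-inspect step can be sketched as follows. This is a minimal illustration, not the notebook's own setup cell: the helper names `load_csv_for_eda` and `describe_shape` are hypothetical, and the two sample rows are a toy stand-in for Transactions.csv.

```python
import io

import pandas as pd


def load_csv_for_eda(source):
    """Read a DLD CSV (path or file-like object) without any filtering."""
    # low_memory=False parses the file in one pass, so a column with mixed
    # content gets one consistent dtype instead of chunk-by-chunk guesses.
    return pd.read_csv(source, low_memory=False)


def describe_shape(df):
    """Format the row and column counts the way the output below does."""
    rows, cols = df.shape
    return f"Shape: {rows:,} rows  x  {cols} columns"


# Toy stand-in for Transactions.csv (illustrative rows only).
sample_csv = io.StringIO(
    "transaction_id,trans_group_en,instance_date,actual_worth\n"
    "T-1,Sales,27-09-2004,4101000\n"
    "T-2,Mortgages,06-03-2008,3000000\n"
)
sample = load_csv_for_eda(sample_csv)
print(describe_shape(sample))
```

On the real file, the same call would take the path to Transactions.csv; `sample.head(3)` and `sample.info()` then give the kind of preview and dtype listing shown in this section.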

Shape: 1,253,267 rows  x  46 columns
transaction_id procedure_id trans_group_id trans_group_ar trans_group_en procedure_name_ar procedure_name_en instance_date property_type_id property_type_ar property_type_en property_sub_type_id property_sub_type_ar property_sub_type_en property_usage_ar ... nearest_metro_ar nearest_metro_en nearest_mall_ar nearest_mall_en rooms_ar rooms_en has_parking procedure_area actual_worth meter_sale_price rent_value meter_rent_price no_of_parties_role_1 no_of_parties_role_2 no_of_parties_role_3
0 1-11-2004-2023 11 1 مبايعات Sales بيع Sell 27-09-2004 1 أرض Land NaN NaN NaN سكني ... محطة مترو بنك أبوظبي التجاري ADCB Metro Station مول دبي Dubai Mall NaN NaN 0 1904.88 4101000.00 2152.89 NaN NaN 14.00 1.00 0.00
1 2-13-2008-381 13 2 رهون Mortgages تسجيل رهن Mortgage Registration 06-03-2008 4 فيلا Villa NaN NaN NaN أخرى ... محطة مترو بنك أبوظبي التجاري ADCB Metro Station مول دبي Dubai Mall NaN NaN 0 896.61 3000000.00 3345.94 NaN NaN 1.00 1.00 0.00
2 3-9-2006-300097 9 3 هبات Gifts هبه Grant 17-07-2006 4 فيلا Villa NaN NaN NaN سكني ... محطة مترو الجافلية Al Jafiliya Metro Station مول دبي Dubai Mall NaN NaN 0 341.81 1199971.00 3510.64 NaN NaN 1.00 1.00 0.00

3 rows × 46 columns

<class 'pandas.DataFrame'>
RangeIndex: 1253267 entries, 0 to 1253266
Data columns (total 46 columns):
 #   Column                Non-Null Count    Dtype  
---  ------                --------------    -----  
 0   transaction_id        1253267 non-null  str    
 1   procedure_id          1253267 non-null  int64  
 2   trans_group_id        1253267 non-null  int64  
 3   trans_group_ar        1253267 non-null  str    
 4   trans_group_en        1253267 non-null  str    
 5   procedure_name_ar     1253267 non-null  str    
 6   procedure_name_en     1253267 non-null  str    
 7   instance_date         1253267 non-null  str    
 8   property_type_id      1253267 non-null  int64  
 9   property_type_ar      1253267 non-null  str    
 10  property_type_en      1253267 non-null  str    
 11  property_sub_type_id  975897 non-null   float64
 12  property_sub_type_ar  975897 non-null   str    
 13  property_sub_type_en  975897 non-null   str    
 14  property_usage_ar     1253267 non-null  str    
 15  property_usage_en     1253267 non-null  str    
 16  reg_type_id           1253267 non-null  int64  
 17  reg_type_ar           1253267 non-null  str    
 18  reg_type_en           1253267 non-null  str    
 19  area_id               1253267 non-null  int64  
 20  area_name_ar          1253267 non-null  str    
 21  area_name_en          1253267 non-null  str    
 22  building_name_ar      866918 non-null   str    
 23  building_name_en      867318 non-null   str    
 24  project_number        846489 non-null   float64
 25  project_name_ar       846489 non-null   str    
 26  project_name_en       846489 non-null   str    
 27  master_project_en     1030218 non-null  str    
 28  master_project_ar     1030168 non-null  str    
 29  nearest_landmark_ar   1071296 non-null  str    
 30  nearest_landmark_en   1071296 non-null  str    
 31  nearest_metro_ar      938194 non-null   str    
 32  nearest_metro_en      938194 non-null   str    
 33  nearest_mall_ar       933053 non-null   str    
 34  nearest_mall_en       933053 non-null   str    
 35  rooms_ar              957141 non-null   str    
 36  rooms_en              957141 non-null   str    
 37  has_parking           1253267 non-null  int64  
 38  procedure_area        1253267 non-null  float64
 39  actual_worth          1253267 non-null  float64
 40  meter_sale_price      1253267 non-null  float64
 41  rent_value            34794 non-null    float64
 42  meter_rent_price      34794 non-null    float64
 43  no_of_parties_role_1  1252341 non-null  float64
 44  no_of_parties_role_2  1252341 non-null  float64
 45  no_of_parties_role_3  1252341 non-null  float64
dtypes: float64(10), int64(6), str(30)
memory usage: 439.8 MB

1.1 Missing values

Missing Pct %
meter_rent_price 1218473 97.20
rent_value 1218473 97.20
project_name_ar 406778 32.50
project_number 406778 32.50
project_name_en 406778 32.50
building_name_ar 386349 30.80
building_name_en 385949 30.80
nearest_mall_en 320214 25.60
nearest_mall_ar 320214 25.60
nearest_metro_en 315073 25.10
nearest_metro_ar 315073 25.10
rooms_en 296126 23.60
rooms_ar 296126 23.60
property_sub_type_en 277370 22.10
property_sub_type_id 277370 22.10
property_sub_type_ar 277370 22.10
master_project_ar 223099 17.80
master_project_en 223049 17.80
nearest_landmark_ar 181971 14.50
nearest_landmark_en 181971 14.50
no_of_parties_role_1 926 0.10
no_of_parties_role_2 926 0.10
no_of_parties_role_3 926 0.10
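A table like the one above can be produced with a small helper. The sketch below assumes the percentage is the null count divided by the total row count, rounded to one decimal; the function name `missing_summary` and the toy frame are illustrative only.

```python
import pandas as pd


def missing_summary(df):
    """Return null counts and percentages for columns that have any nulls."""
    missing = df.isna().sum()
    summary = pd.DataFrame(
        {
            "Missing": missing,
            "Pct %": (missing / len(df) * 100).round(1),
        }
    )
    # Keep only columns with at least one null, most affected first.
    summary = summary[summary["Missing"] > 0]
    return summary.sort_values("Missing", ascending=False)


# Toy frame (illustrative values only).
toy = pd.DataFrame(
    {
        "area_id": [1, 2, 3, 4],
        "rooms_en": ["x", None, "y", "z"],
        "rent_value": [None, None, None, 50000.0],
    }
)
print(missing_summary(toy))
```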

1.2 Transaction types

The trans_group_en column distinguishes three transaction groups: sales, mortgages, and gifts.

Count
trans_group_en
Sales 944983
Mortgages 262422
Gifts 45862

1.3 Annual transaction volume (2000 onward)

Years before 2000 have very sparse and inconsistent entries. We restrict the time axis to 2000–2025 for a clean view of the modern market.
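One way to build the yearly counts is to parse instance_date with an explicit DD-MM-YYYY format and keep only years inside the window. This is a sketch under that assumption; the function name `annual_counts` is hypothetical and the sample dates are illustrative.

```python
import pandas as pd


def annual_counts(dates, first_year=2000, last_year=2025):
    """Count rows per calendar year, keeping only years inside the window."""
    # errors="coerce" turns unparseable strings into NaT instead of raising.
    parsed = pd.to_datetime(dates, format="%d-%m-%Y", errors="coerce")
    years = parsed.dt.year
    # NaT rows have a NaN year, which between() treats as out of range.
    in_range = years.between(first_year, last_year)
    return years[in_range].astype(int).value_counts().sort_index()


# Illustrative date strings in the file's DD-MM-YYYY format.
dates = pd.Series(
    ["27-09-2004", "06-03-2008", "17-07-2006", "01-01-1995", "15-05-2008", "not a date"]
)
print(annual_counts(dates))
```

The explicit format matters because a string such as 06-03-2008 could otherwise be read as either 6 March or 3 June.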

[Figure: annual transaction volume, 2000–2025]

1.4 Property type breakdown

--- property_type_en ---
Count
property_type_en
Unit 867318
Villa 247338
Land 104854
Building 33757
--- property_sub_type_en ---
Count
property_sub_type_en
Flat 752822
Villa 108357
Office 61053
Hotel Apartment 24672
Shop 14043
Hotel Rooms 13119
Workshop 515
Stacked Townhouses 439
Store 319
Building 225
--- property_usage_en ---
Count
property_usage_en
Residential 1022134
Commercial 158182
Hospitality 37854
Other 27213
Industrial 4149
Multi-Use 1971
Agricultural 1080
Storage 655
Residential / Commercial 29
--- reg_type_en ---
Count
reg_type_en
Existing Properties 876328
Off-Plan Properties 376939

1.5 Room type distribution

[Figure: room type distribution]

1.6 Sale price per m² distribution

Values above the 95th percentile are excluded to remove extreme commercial and land transactions, giving a readable view of the residential price range. As the counts below show, rows above the threshold are dropped rather than capped.
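The percentile cut can be sketched as below. Two assumptions are made here: non-positive and missing prices are removed before the threshold is computed, and rows above the threshold are dropped, which is what a 5% fall in row count implies. The function name `drop_above_quantile` is hypothetical.

```python
import pandas as pd


def drop_above_quantile(values, q=0.95):
    """Drop missing, non-positive, and above-threshold values.

    Returns the surviving values and the threshold that was applied.
    """
    valid = values.dropna()
    valid = valid[valid > 0]  # assumption: zero or negative prices are invalid
    threshold = valid.quantile(q)
    kept = valid[valid <= threshold]
    return kept, threshold


# Illustrative prices: 1..100 plus one zero and one missing value.
prices = pd.Series([float(i) for i in range(1, 101)] + [0.0, float("nan")])
kept, threshold = drop_above_quantile(prices)
print(len(kept), threshold, kept.mean(), kept.median())
```

By contrast, `Series.clip(upper=threshold)` would keep every row and cap the large values at the threshold, which leaves the count unchanged and piles observations up at the cut-off.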

Count (before clip):  1,253,262
Count (after clip):   1,190,598
Clip threshold:       AED 28,545 / m²
Mean:   AED 10,738 / m²
Median: AED 9,671 / m²
[Figure: distribution of sale price per m², up to the 95th percentile]

1.7 Unit area distribution

Clipped at 500 m² to focus on residential units. The vast majority of flats and apartments fall well below this threshold; larger areas mostly correspond to villas, plots, and commercial units.

Count (before clip):  1,253,267
Count (after clip):   1,052,105  (83.9% of total)
[Figure: unit area distribution, up to 500 m²]

1.8 Top areas and projects

--- Top 15 areas ---
Transactions
area_name_en
Marsa Dubai 118929
Business Bay 88294
Al Thanyah Fifth 84158
Al Barsha South Fourth 71289
Burj Khalifa 62300
Al Warsan First 53377
Jabal Ali First 43916
Palm Jumeirah 39104
Al Hebiah Fourth 37347
Wadi Al Safa 5 36569
Al Merkadh 31243
Al Thanyah Third 31079
Hadaeq Sheikh Mohammed Bin Rashid 30751
Al Thanayah Fourth 29651
Nadd Hessa 26842
--- Top 15 projects ---
Transactions
project_name_en
REMRAAM 10281
SKY COURTS 9663
JUMEIRAH PARK 6558
INTERNATIONAL CITY EMARATI 4775
VICTORY HEIGHTS 4107
LAKESIDE 3842
CHURCHILL TOWER 3706
AL KHAIL HEIGHTS 3430
LAGO VISTA 3245
DAMAC TOWERS BY PARAMOUNT 3232
MARINA RESIDENCE 3189
SEVEN CITY JLT 3186
BURJ KHALIFA TOWERS 3168
TOWN SQUARE ZAHRA 3118
PARK ISLANDS 3094

1.9 Median sale price per m² over time (2000–2025)

[Figure: median sale price per m² by year, 2000–2025]

2. Rent Contracts Dataset

The Ejari rent contracts file records all registered tenancy agreements. We inspect the raw file before any filtering.

Shape: 7,111,733 rows  x  40 columns
contract_id contract_reg_type_id contract_reg_type_ar contract_reg_type_en contract_start_date contract_end_date contract_amount annual_amount no_of_prop line_number is_free_hold ejari_bus_property_type_id ejari_bus_property_type_ar ejari_bus_property_type_en ejari_property_type_id ... master_project_ar master_project_en area_id area_name_ar area_name_en actual_area nearest_landmark_ar nearest_landmark_en nearest_metro_ar nearest_metro_en nearest_mall_ar nearest_mall_en tenant_type_id tenant_type_ar tenant_type_en
0 CRT1012981266 1 جديد New 07-04-2019 06-04-2020 85000 85000 1 1 1 2 وحدة Unit 2.00 ... الخليج التجاري Business Bay 526 الخليج التجارى Business Bay 140.00 وسط مدينة دبي Downtown Dubai محطة مترو بوج خليفة دبي مول Buj Khalifa Dubai Mall Metro Station مول دبي Dubai Mall 1.00 شخص Person
1 CRT1012983196 1 جديد New 20-04-2019 19-04-2020 110000 110000 1 1 1 4 فيلا Villa 841.00 ... قرية جميرا المثلثة Jumeirah Village Triangle 442 البرشاء جنوب الخامسة Al Barsha South Fifth 734.00 أكاديمية المدينة الرياضية للسباحة Sports City Swimming Academy محطة مترو النخيل Nakheel Metro Station مارينا مول Marina Mall 1.00 شخص Person
2 CRT1012984226 1 جديد New 11-04-2019 10-04-2020 100000 100000 1 1 1 4 فيلا Villa 841.00 ... NaN NaN 506 اليلايس 1 Al Yelayiss 1 324.00 دورة دبي للدراجات Dubai Cycling Course NaN NaN NaN NaN 1.00 شخص Person

3 rows × 40 columns

<class 'pandas.DataFrame'>
RangeIndex: 7111733 entries, 0 to 7111732
Data columns (total 40 columns):
 #   Column                      Dtype  
---  ------                      -----  
 0   contract_id                 str    
 1   contract_reg_type_id        int64  
 2   contract_reg_type_ar        str    
 3   contract_reg_type_en        str    
 4   contract_start_date         str    
 5   contract_end_date           str    
 6   contract_amount             int64  
 7   annual_amount               int64  
 8   no_of_prop                  int64  
 9   line_number                 int64  
 10  is_free_hold                int64  
 11  ejari_bus_property_type_id  int64  
 12  ejari_bus_property_type_ar  str    
 13  ejari_bus_property_type_en  str    
 14  ejari_property_type_id      float64
 15  ejari_property_type_en      str    
 16  ejari_property_type_ar      str    
 17  ejari_property_sub_type_id  float64
 18  ejari_property_sub_type_en  str    
 19  ejari_property_sub_type_ar  str    
 20  property_usage_en           str    
 21  property_usage_ar           str    
 22  project_number              float64
 23  project_name_ar             str    
 24  project_name_en             str    
 25  master_project_ar           str    
 26  master_project_en           str    
 27  area_id                     int64  
 28  area_name_ar                str    
 29  area_name_en                str    
 30  actual_area                 float64
 31  nearest_landmark_ar         str    
 32  nearest_landmark_en         str    
 33  nearest_metro_ar            str    
 34  nearest_metro_en            str    
 35  nearest_mall_ar             str    
 36  nearest_mall_en             str    
 37  tenant_type_id              float64
 38  tenant_type_ar              str    
 39  tenant_type_en              str    
dtypes: float64(5), int64(8), str(27)
memory usage: 2.1 GB

2.1 Missing values

Missing Pct %
project_name_en 6040500 84.90
project_name_ar 6040500 84.90
project_number 6040500 84.90
master_project_ar 4602506 64.70
master_project_en 4602490 64.70
nearest_mall_ar 841530 11.80
nearest_mall_en 841530 11.80
nearest_metro_ar 776980 10.90
nearest_metro_en 776980 10.90
tenant_type_en 758070 10.70
tenant_type_ar 758070 10.70
tenant_type_id 758070 10.70
nearest_landmark_ar 504240 7.10
nearest_landmark_en 504240 7.10
actual_area 142885 2.00
ejari_property_sub_type_en 60697 0.90
ejari_property_sub_type_ar 60697 0.90
ejari_property_sub_type_id 55853 0.80
ejari_property_type_en 54628 0.80
ejari_property_type_ar 54628 0.80
ejari_property_type_id 53519 0.80
property_usage_en 10949 0.20
property_usage_ar 10949 0.20

2.2 Contract and property type breakdown

--- contract_reg_type_en ---
Count
contract_reg_type_en
New 3624049
Renew 3487684
--- ejari_bus_property_type_en ---
Count
ejari_bus_property_type_en
Unit 6518338
Villa 536520
Land 53519
Building 3356
--- ejari_property_type_en ---
Count
ejari_property_type_en
Flat 4157051
Office 786665
Shop 731861
Labor Camps 527968
Villa 480489
Warehouse 85368
Studio 70075
Hotel 51386
--- ejari_property_sub_type_en ---
Count
ejari_property_sub_type_en
1bed room+Hall 1605137
2 bed rooms+hall 1535866
Studio 818154
Office 771676
Shop 674592
3 bed rooms+hall 556293
Room in labor Camp 519487
4 bed rooms+hall 189635
--- property_usage_en ---
Count
property_usage_en
Residential 5270287
Commercial 1750968
Industrial 29019
Industrial / Commercial 25583
Multi Usage 9573
Industrial / Commercial / Residential 7076
Storage 2972
Tourist origin 1972

2.3 Annual contract volume (2010–2025)

Rent contract registration on Ejari became systematic from around 2010. We restrict the axis to 2010–2025 to avoid showing spurious future dates from data entry errors.

2.4 Annual rent distribution

Contracts with an annual rent below AED 5,000 are removed first as likely entry errors; values above the 95th percentile of the remainder are then excluded. This focuses the view on the realistic residential rental range.

Count (after floor):  7,093,787
Count (after clip):   6,739,097
Clip threshold:       AED 1,299,816
Mean:   AED 115,925
Median: AED 61,600
[Figure: annual rent distribution]

2.5 Unit area distribution (rent contracts)

Clipped at 500 m², matching the transactions dataset, to focus on the residential unit range.

Count (before clip):  6,844,081
Count (after clip):   6,505,600  (95.1% of the pre-clip count)
[Figure: unit area distribution for rent contracts, up to 500 m²]

2.6 Median annual rent per m² over time (2010–2025)

Computed from valid records only (area > 0, date within range). The rent per m² metric normalises for unit size differences across years.
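A sketch of this calculation follows. It assumes the year comes from contract_start_date and the rent from annual_amount; the text above does not say which columns were used, so both choices are assumptions, as is the function name `median_rent_per_sqm_by_year`.

```python
import pandas as pd


def median_rent_per_sqm_by_year(contracts, first_year=2010, last_year=2025):
    """Median of annual_amount / actual_area per contract start year."""
    df = contracts.copy()
    start = pd.to_datetime(
        df["contract_start_date"], format="%d-%m-%Y", errors="coerce"
    )
    df["year"] = start.dt.year
    # Valid records: positive area and a start year inside the window.
    # Null areas and unparseable dates compare as False and drop out.
    valid = df[
        (df["actual_area"] > 0) & df["year"].between(first_year, last_year)
    ].copy()
    valid["rent_per_sqm"] = valid["annual_amount"] / valid["actual_area"]
    return valid.groupby(valid["year"].astype(int))["rent_per_sqm"].median()


# Toy contracts (illustrative values only).
contracts = pd.DataFrame(
    {
        "contract_start_date": [
            "07-04-2019", "20-04-2019", "11-04-2019",
            "01-01-2020", "01-01-2020", "01-01-2035", "01-01-2020",
        ],
        "annual_amount": [85000, 110000, 100000, 60000, 50000, 50000, 90000],
        "actual_area": [140.0, 734.0, 324.0, 100.0, 0.0, 100.0, None],
    }
)
print(median_rent_per_sqm_by_year(contracts))
```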

[Figure: median annual rent per m² by year, 2010–2025]

2.7 Top areas by contract volume

--- Top 15 areas (rent contracts) ---
Contracts
area_name_en
Al Warsan First 302247
Jabal Ali First 259119
Naif 211797
Al Karama 208099
Marsa Dubai 196764
Jabal Ali Industrial First 194143
Business Bay 170394
Al Nahda Second 169891
Al Mararr 163685
Nadd Hessa 158552
Al Suq Al Kabeer 158142
Al Barsha First 157948
Al Goze Industrial Second 152263
Al Murqabat 146972
Mirdif 145465

3. Joint Overview - Sales vs Rent Trends

We use the filtered residential segment (2–3 bed flats, 70–160 m², 2014 onward) to compare price and rent on the same time axis.

Filtered sales rows:  81,790
Filtered rent rows:   475,542
[Two figures: sale price and rent trends for the filtered residential segment]
Gross Yield %
date
2014 6.67
2015 7.97
2016 7.85
2017 7.39
2018 7.62
2019 7.46
2020 6.91
2021 5.30
2022 5.00
2023 5.60
2024 6.37
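The output above does not show how Gross Yield % is derived. A common definition, assumed here rather than taken from the notebook, is the median annual rent per m² divided by the median sale price per m² for the same year, times 100. The function name `gross_yield_pct` and the numbers in the example are illustrative.

```python
import pandas as pd


def gross_yield_pct(median_rent_per_sqm, median_price_per_sqm):
    """Gross yield in percent for the years present in both series."""
    # Align on the year index so only years with both a rent and a price remain.
    rent, price = median_rent_per_sqm.align(median_price_per_sqm, join="inner")
    return (rent / price * 100).round(2).rename("Gross Yield %")


# Illustrative yearly medians in AED per m² (not taken from the dataset).
rent = pd.Series({2014: 800.0, 2015: 900.0, 2016: 850.0})
price = pd.Series({2014: 12000.0, 2015: 12000.0})
print(gross_yield_pct(rent, price))
```

Working with per m² medians on both sides keeps the ratio comparable even when the typical unit size differs between the sales and rent samples.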

4. Data Quality Notes

Issue | Detail
Sparse early years | Transactions before 2000 are very few. Consistent coverage begins around 2013-2014.
Future dates in rent file | Some rent contracts have malformed dates that parse to 2030 or later. Filter the date index to 2025 or earlier.
Outlier prices | meter_sale_price has extreme values from land and commercial transactions. Clip at the 95th percentile for residential analysis.
Unit area outliers | procedure_area includes very large plots. Clip at 500 m² for residential scope.
Master project nulls | About 17.8% of transactions have no master_project_en. These rows are excluded from project-level analysis.
Thin projects | Many projects have fewer than 10 transactions, and a median price computed from so few rows is unreliable. Filter to projects with sufficient volume.
Rent area nulls | actual_area is null for about 2% of rent contracts, which prevents the rent per m² calculation for those rows.
Date format | Both files use DD-MM-YYYY string format. Parse with format='%d-%m-%Y' to avoid ambiguity.
Transaction types | The transactions file includes sales, mortgages, and gifts. Filter to trans_group_en == 'Sales' for sales analysis, and additionally to reg_type_en == 'Existing Properties' for resale analysis.
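The thin-projects note can be applied with a per-project count alongside the median. The sketch below uses the threshold of 10 transactions mentioned above as its default; the function name `project_medians` and the toy data are illustrative.

```python
import pandas as pd


def project_medians(sales, min_transactions=10):
    """Median meter_sale_price per project, for projects with enough rows."""
    named = sales.dropna(subset=["project_name_en"])
    stats = named.groupby("project_name_en")["meter_sale_price"].agg(
        transactions="count", median_price="median"
    )
    # Discard projects whose median rests on too few transactions.
    return stats[stats["transactions"] >= min_transactions]


# Toy data: project A has 12 sales, project B has 3, and 2 rows have no project.
toy_sales = pd.DataFrame(
    {
        "project_name_en": ["A"] * 12 + ["B"] * 3 + [None] * 2,
        "meter_sale_price": [float(i) for i in range(1, 13)]
        + [100.0, 200.0, 300.0]
        + [50.0, 60.0],
    }
)
print(project_medians(toy_sales))
```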

For yield estimation, ROI projections, and project-level analysis, see the companion analysis notebook.