The Technology Innovation Institute (TII) is a research institute based in Abu Dhabi, United Arab Emirates. It is best known in AI for the Falcon family of open-weight large language models, which it describes as part of a mission to make advanced AI accessible globally. The Falcon releases established the UAE as a serious open-model builder at a time when frontier open weights came almost exclusively from US, European, and Chinese organizations.
Alongside the models themselves, TII’s signature technical contribution is the RefinedWeb dataset, documented in a June 2023 paper. The paper argued that properly filtered and deduplicated web data alone, with no curated books or academic corpora, could train models that outperform ones trained on curated datasets such as The Pile. TII extracted roughly five trillion tokens from Common Crawl for Falcon’s training and publicly released a 600-billion-token slice. The Falcon models include 40-billion- and 180-billion-parameter versions: Falcon-40B is released under Apache 2.0, while Falcon-180B uses a TII license that is based on Apache 2.0 but adds restrictions, notably on offering the model as a hosted service.
Why business readers should care: TII shows that frontier-scale open models can come from state-backed research institutes outside the usual AI centers, broadening the supply of weights organizations can self-host and reducing dependence on a handful of US labs.