Welcome to TBImages, where the Research Group in Pattern Recognition and Optimization makes available to the research community its databases of sputum smear microscopy images for tuberculosis diagnosis, built up over more than a decade.

The development of diagnostic tools that automatically detect diagnostic findings or identify areas of interest in medical or microscopic images has been the subject of research and development in recent decades, driven by advances in imaging equipment and in the fields of computer vision and pattern recognition.

Many studies have been published on the automatic detection and recognition of Mycobacterium tuberculosis in bright-field microscopy (K. Nurzynska et al., 2023; T. N. Suzuki et al., 2012; M. H. Guo et al., 2022; Zaizen et al., 2022; S. Zurac et al., 2022; A. U. Ibrahim et al., 2021; K. S. Mithra and W. R. Sam Emmanuel, 2021; Y. M. Yang et al., 2020; H. Yousefi et al., 2020; M. K. M. Serrao et al., 2020; K. Swetha et al., 2020; M. El-Melegy et al., 2019; C. P. Kuok et al., 2019; M. G. F. Costa et al., 2019; R. Dinesh Jackson Samuel and B. Rajesh Kanna, 2019; R. O. Panicker et al., 2018; K. S. Mithra and W. R. Sam Emmanuel, 2019; Y. Xiong et al., 2018; Y. P. Lopez et al., 2017; E. Sugirtha and G. Muruges et al., 2017; K. S. Mithra and W. R. S. Emmanuel, 2017; E. Priya and S. Srinivasan, 2016; M. I. Shah et al., 2016; J. A. Quinn et al., 2016; R. S. Soans et al., 2016; Costa Filho et al., 2015; M. Rico-Garcia et al., 2015; S. Ayas and M. Ekinci, 2014; R. Lumb et al., 2013; Costa Filho et al., 2012a; Costa Filho and Costa, 2012; R. Rulaningtyas et al., 2011; Y. Zhai et al., 2010; R. Khutlang et al., 2010; Makkapat et al., 2009; Sadaphal et al., 2008; Raof et al., 2008; Costa et al., 2008).

Nevertheless, many of the reported successes come from studies whose validation images did not represent the images typically captured in practice (they were captured in a strictly controlled environment), or whose number of images was too small. A database built this way can lead to results that favor one method over another. Robust databases, on the other hand, have great potential to help researchers evaluate and improve their algorithms for object detection and recognition, among other purposes.

It is well known that the performance of algorithms can only be compared when they are tested and validated on the same data set. In some areas, especially new areas of research, this is an impediment, because the development of robust image databases that can serve as benchmarks for testing new algorithms has not yet received due attention.

Our Pattern Recognition & Optimization research group at the Federal University of Amazonas published the first results on the automatic detection of Mycobacterium tuberculosis in sputum smear microscopy (Costa et al., 2008). We have also built the first image database of conventional sputum smear microscopy of tuberculosis patients.

Here we provide two TB image datasets for research purposes. The first, built in 2010, has two subsets: TBIMAGE_DB_FOCUS.V1 and TB_IMAGE_DB_BACILLI.V1 (see the DATABASES menu).
More recently, we completed the second, larger dataset, which consists of six subsets (see the DATABASES (new) menu).

Finally, we expect these databases to accelerate the development of systems that support the automatic diagnosis of tuberculosis.



Tuberculosis (TB) is a highly infectious disease caused by the bacterium Mycobacterium tuberculosis. While treatable, TB remains a significant public health concern. TB is the 13th leading cause of death worldwide. In 2022, an estimated 1.3 million people died from TB, including those co-infected with HIV, and 10.6 million people fell ill with TB. TB is a global burden that disproportionately affects low- and middle-income countries (LMICs): over 80% of TB cases and deaths occur in LMICs. One reason for this disproportionate burden is limited access to healthcare, which delays diagnosis and treatment and leads to increased mortality.


Sputum smear microscopy (SSM) remains a crucial diagnostic tool for TB in LMICs, despite its limitations, for several reasons:

  • SSM is significantly cheaper and simpler to perform than other diagnostic methods such as GeneXpert or culture tests. This makes it accessible even in resource-limited settings with limited infrastructure and budgets.

  • The equipment required for SSM is relatively basic and easy to maintain. 

  • SSM can provide results within hours, allowing for early diagnosis and treatment initiation. This is crucial for controlling the spread of the disease and improving patient outcomes.

  • SSM, while less sensitive than other methods, excels in identifying individuals with high bacterial loads who are most infectious. This is particularly important for prioritizing treatment and preventing further transmission in settings where resources are limited.

  • SSM has been widely used for decades and is well-integrated with existing TB control programs in LMICs.

However, SSM has low sensitivity: it can miss a significant portion of TB cases, particularly those with low bacterial loads. Other microorganisms can mimic TB bacilli in smears, leading to false-positive results. And although the technique is simple, interpreting SSM results requires trained personnel, who may be scarce in some settings.



Automatic sputum smear microscopy (ASMM) holds potential to address several limitations of traditional SSM, particularly in LMICs:

  • ASMM can use machine learning algorithms to analyze smear images with greater sensitivity than the human eye. This can potentially detect TB cases missed by traditional microscopy, especially those with low bacterial loads.

  • ASMM methods can be trained to differentiate between Mycobacterium tuberculosis and other microorganisms that mimic its appearance in smears. This can reduce the number of false-positive results and improve the specificity of the test.

  • ASMM has the potential to reduce reliance on highly trained personnel for interpreting smears. Automatic methods can help minimize the variability and inconsistencies in interpretation observed with manual microscopy. This can be particularly beneficial in LMICs, where trained personnel might be limited.
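
The sensitivity and specificity mentioned above can be made concrete with a small calculation. The following is a minimal sketch in Python, assuming each smear has a binary reference label (1 = TB-positive, 0 = negative) and a binary reading from each method. The function name and all labels are hypothetical and invented for illustration; they are not taken from the TBImages databases or from any published result. Both readers are scored against the same reference labels, since a comparison is only meaningful on the same data set.

```python
def sensitivity_specificity(truth, predicted):
    """Return (sensitivity, specificity) for paired binary labels.

    sensitivity = TP / (TP + FN): fraction of true positives that were found
    specificity = TN / (TN + FP): fraction of true negatives correctly rejected
    """
    if len(truth) != len(predicted):
        raise ValueError("truth and predicted must have the same length")

    pairs = list(zip(truth, predicted))
    tp = sum(1 for t, p in pairs if t and p)          # positive, called positive
    fn = sum(1 for t, p in pairs if t and not p)      # positive, missed
    tn = sum(1 for t, p in pairs if not t and not p)  # negative, called negative
    fp = sum(1 for t, p in pairs if not t and p)      # negative, called positive

    # Undefined when a class is absent from the reference labels.
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Hypothetical reference labels and readings for ten smears (illustration only).
truth     = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
manual    = [1, 1, 0, 0, 0, 0, 0, 0, 0, 1]  # misses two positives, one false alarm
automatic = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]  # misses one positive, no false alarms

for name, reading in (("manual", manual), ("automatic", automatic)):
    sens, spec = sensitivity_specificity(truth, reading)
    print(f"{name}: sensitivity={sens:.2f}, specificity={spec:.2f}")
```

With these made-up labels, the hypothetical automatic reader scores higher on both measures, but the numbers carry no evidential weight; they only show how the two quantities are computed.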