One of the biggest challenges in protecting endangered animals, in tropical rainforests and elsewhere, is counting them and gaining accurate information about their conditions. Without automated cameras to capture images, there’s no way to know what’s going on. Yet sorting through hundreds of thousands of images, and sometimes millions, is a next-to-impossible task.
“The challenges and resource demands can be enormous,” reports Jorge Ahumada, executive director of the Team Network at Conservation International, headquartered in Arlington, Va. The 27-year-old organization focuses on 250 species of mammals and birds in 15 countries in tropical areas. These animals range from golden cats in Uganda to Asian elephants in Cambodia.
“We set up thousands of cameras in order to monitor conditions and capture data about what is taking place,” Ahumada says. Typically, the cameras are used at a location for about 30 days, and they snap between 20,000 and 40,000 images.
Afterward, personnel in the field retrieve memory cards and upload the data. The software extracts exchangeable image file format (EXIF) data that produces time-stamped records for analysis.
The system generates a 1 or 0 based on whether a species is captured in an image on a particular day. The organization currently holds about 2.5 million images, and the number grows daily. Overall, this equals upward of 4 terabytes of data.
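The 1-or-0 daily record described above can be sketched in a few lines of Python. This is only an illustration, not the organization's software: the species labels, the sample timestamps and the function name detection_matrix are hypothetical, and the timestamp strings are assumed to follow the "YYYY:MM:DD HH:MM:SS" layout that EXIF uses for capture times.

```python
from collections import defaultdict
from datetime import date, datetime

# Hypothetical records: (species label, EXIF-style capture timestamp).
records = [
    ("asian_elephant", "2016:03:01 05:12:44"),
    ("asian_elephant", "2016:03:01 18:40:02"),
    ("golden_cat", "2016:03:02 23:05:10"),
    ("asian_elephant", "2016:03:03 06:30:00"),
]


def detection_matrix(records, days):
    """Return {species: [1 or 0 per day]}: 1 if the species was photographed that day."""
    seen = defaultdict(set)
    for species, stamp in records:
        # EXIF timestamps separate the date fields with colons.
        day = datetime.strptime(stamp, "%Y:%m:%d %H:%M:%S").date()
        seen[species].add(day)
    return {
        species: [1 if day in dates else 0 for day in days]
        for species, dates in seen.items()
    }


days = [date(2016, 3, 1), date(2016, 3, 2), date(2016, 3, 3)]
matrix = detection_matrix(records, days)
print(matrix["asian_elephant"])  # [1, 0, 1]
print(matrix["golden_cat"])      # [0, 1, 0]
```

Several images of the same species on one day still yield a single 1, which is why the elephant's two photos on the first day count once.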
“The technical challenge is analyzing the huge volume of data collected by all the cameras and sensors and obtaining accurate animal counts,” Ahumada explains. “In the past, we faced a huge hurdle trying to process all the data with a limited IT infrastructure.” In fact, in many cases, the staff had to tackle the task manually.
Taking Image and Data Processing to a Higher Level
To resolve these challenges, Conservation International turned to Hewlett-Packard, now Hewlett Packard Enterprise (HPE), to take its image and data processing capabilities to a higher level. The environmental organization uses a custom software solution, the Wildlife Picture Index Analytic System, to sort through all the data, identify patterns and generate statistical models. It is built on top of a Vertica Systems analytic database.
The objective is to use statistical modeling to generate animal population data that’s based on the “geometric mean” of specific species. (The “geometric mean” is a type of average in which the numbers are multiplied together and the nth root of the product is taken, where n is how many numbers there are: the square root for two values, the cube root for three.) “It’s an approach heavily focused on data science,” Ahumada says.
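For readers who want to see the arithmetic, here is a minimal Python sketch of a geometric mean. The input values are made up for illustration, and the article does not say exactly which per-species quantities the Wildlife Picture Index Analytic System averages, so this shows only the generic calculation. The helper name geo_mean is hypothetical; statistics.geometric_mean is part of the Python standard library (3.8 and later).

```python
import math
from statistics import geometric_mean


def geo_mean(values):
    """Geometric mean: the nth root of the product of n positive numbers.

    Computed through logarithms so that long lists of small values
    do not underflow when multiplied together.
    """
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Hypothetical per-species values (illustration only).
values = [0.2, 0.4, 0.8]

# 0.2 * 0.4 * 0.8 = 0.064, and the cube root of 0.064 is 0.4.
print(round(geo_mean(values), 6))        # 0.4
print(round(geometric_mean(values), 6))  # 0.4
```

Unlike the ordinary average, which here would be about 0.47, the geometric mean responds to proportional change: halving any one value lowers the result by the same factor regardless of which value it is.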
The result, says Eric Fegraus, senior director of technology and external relations for the Team Network at Conservation International, is a 30x improvement in processing speed. “We are able to sort through the data and get to meaningful results much faster,” he says.
Currently, the organization uses a single dashboard to conduct simulations and explore visualizations in order to better understand trends and conditions. The dashboard also aids policymakers and others in developing and managing programs.
The data-driven approach has produced valuable results. “In the past, there had been a lot of debate among conservation biologists and others about whether there are actual results in protected areas,” Ahumada says. “Because we use a science-based approach, the data is critical for demonstrating that protected areas actually work.”
The organization is now looking to incorporate image-recognition software to further advance the capabilities of the technology. “We now have an IT infrastructure that allows us to put resources to use far more effectively and accomplish our mission,” Fegraus reports.