Data: More is NOT Better

We can collect more data now than at any point in history. Yet a high volume of data does not imply high quality: more data is not better data.

LIDAR surveys are a case in point.

What on earth am I supposed to do with a 10 cm grid point cloud of 10 000 ha?
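To put a number on that question, here is a back-of-envelope sketch in Python. The function name and the 24 bytes per point (x, y and z stored as three 8-byte floats) are my own assumptions for illustration; real file formats pack points differently.

```python
# Back-of-envelope point count for a regular grid survey.
HA_TO_M2 = 10_000  # 1 ha = 10 000 m^2


def grid_point_count(area_ha: float, spacing_m: float) -> int:
    """Approximate number of points on a regular grid covering area_ha."""
    points_per_m2 = (1.0 / spacing_m) ** 2
    return round(area_ha * HA_TO_M2 * points_per_m2)


# Assumption: each point is stored as three 8-byte floats (x, y, z).
BYTES_PER_POINT = 24

points = grid_point_count(10_000, 0.10)
size_gb = points * BYTES_PER_POINT / 1e9
print(f"{points:.1e} points, roughly {size_gb:.0f} GB uncompressed")
```

That works out to about 10 billion points. Under the same assumptions, a 20 m grid over the same area would be about 250 000 points.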

A huge amount of completely unnecessary detail is picked up. Who is interested in the clods and molehills in the middle of a field? They are irrelevant to the design, yet every one of those data points clogs up my hard drive. The brutal thinning process necessary to translate the point cloud into a digital terrain model that can actually be used in irrigation/civil design software takes time, and it needs continual sense-checking to make sure the useful points stay and the noise goes.
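As a rough illustration of what thinning involves, the sketch below keeps only the lowest point in each grid cell. This is one crude approach chosen for the example, not a description of any particular software package; the function name, the 5 m cell size and the sample coordinates are all hypothetical.

```python
import math
from typing import Dict, Iterable, List, Tuple

Point = Tuple[float, float, float]  # (x, y, z) in metres


def thin_to_grid(points: Iterable[Point], cell_m: float) -> List[Point]:
    """Keep one point per cell_m x cell_m cell: the one with the lowest z."""
    lowest: Dict[Tuple[int, int], Point] = {}
    for x, y, z in points:
        # Integer cell index that this point falls into.
        key = (math.floor(x / cell_m), math.floor(y / cell_m))
        if key not in lowest or z < lowest[key][2]:
            lowest[key] = (x, y, z)
    return sorted(lowest.values())


# Hypothetical sample: four dense points in one 5 m cell, one in the next.
cloud = [
    (0.1, 0.1, 100.02),
    (0.2, 0.1, 100.35),  # a clod or a bit of vegetation
    (0.1, 0.2, 100.00),
    (0.2, 0.2, 100.04),
    (5.1, 0.1, 100.10),
]
thinned = thin_to_grid(cloud, cell_m=5.0)
print(thinned)
```

Five points become two. Note that a lowest-point rule is just as happy to throw away the crest of a berm or the top of a canal bank as it is a molehill, which is exactly why the result still needs sense-checking.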

To me, the accuracy of LIDAR and photogrammetric data is often in question as well. The most hilarious example of inaccurate data that I have seen is a survey I was given to build a flood inundation model. There were too few ground control points over a long, narrow strip survey, and the survey showed a rise of 4 m in river bed level! Needless to say, the flood model was not representative of reality. I have also seen “ground level” surveys of sugar cane where the “ground level” varies by cane age, i.e. what is being picked up is not ground level.

A small amount of intelligently collected, clean data beats data spam every time. More data does not imply a better design; often it means that the design process takes longer than it needs to, because of the time spent cleaning and thinning dirty, dense data.

Yes, dense data makes very impressive 3D models and visualizations, but as far as engineering usefulness goes, it is hard to beat a proper surveyor with a staff.
