Small Data Can Play a Big Role in AI

More than three quarters of large companies today have a "data-hungry" AI initiative under way: projects involving neural networks or deep-learning systems trained on huge repositories of data. Yet many of the most valuable data sets in organizations are quite small. Think kilobytes or megabytes rather than exabytes. Because this data lacks the volume and velocity of big data, it's often overlooked, languishing in PCs and functional databases, unconnected to enterprise-wide IT innovation initiatives.
For every big data set (with a billion rows and columns) fueling an AI or advanced-analytics initiative, a typical large organization may have a thousand small data sets that go unused. Examples abound: marketing surveys of new customer segments, meeting minutes, spreadsheets with fewer than 1,000 rows and columns. As small-data techniques advance, their increased efficiency, accuracy, and transparency will be put to work across industries and business functions. Think drug discovery, industrial image retrieval, the design of new consumer products, the detection of defective factory machine parts, and much more.

But competitive advantage will come not from automation but from the human factor. For example, as AI plays an ever-bigger role in employee skills training, its ability to learn from smaller data sets will enable expert employees to embed their expertise in the training systems, continually improving them and efficiently transferring their skills to other workers. People who are not data scientists could be transformed into AI trainers, enabling companies to apply and scale the vast reserves of untapped expertise unique to their organizations.