What lurks inside that digital filing cabinet? DataGravity knows
by Heather Clancy @FortuneMagazine
February 20, 2015, 9:35 AM EST

When it comes to data management, many startups accentuate the negative, warning of the scary things that can happen to businesses that don’t control access. DataGravity’s pitch is more nuanced. Yes, its technology protects companies against potentially embarrassing or costly information exposure. It also helps identify files buried within digital file cabinets that maybe should be shared more readily, automatically indexing, securing and managing files as they are saved based on potential compliance concerns. Most businesses, especially smaller ones, don’t typically have the time or resources to do this.

“It’s really about harvesting the information in your humanly generated data, making sure you’re leveraging the positives and protecting yourself against the negatives,” said the New Hampshire startup’s co-founder and CEO, Paula Long.

DataGravity calls this strategy “data-aware” storage; it has raised $92 million in venture funding from Accel Partners, Andreessen Horowitz, Charles River Ventures, and General Catalyst Partners to tell its story. Long and her co-founder John Joseph should have little difficulty finding an audience: their previous startup, EqualLogic, sold to Dell for $1.4 billion.

Fortune spoke with Long about what sets DataGravity apart and what’s driving early adoption of its Discovery Series product line, which shipped commercially in October 2014. Following are interview excerpts, edited for clarity and length.

Define ‘data-aware’ storage.

Storage technology really is, for lack of a better word, just a bunch of bookcases to store information. I used to say ‘containers,’ but now I use ‘bookcases’ or ‘milk crates’ or ‘file cabinets’ with no organization, except for minimal alphabetical organization.
What happens today is all your data is stored there, but you don’t know anything about it. You don’t know what’s helpful, and you don’t know what’s scary. Storage is a big, black box. Data-aware storage is about taking that file cabinet full of stuff and then finding the good stuff and leveraging it to make the company more successful; or finding the scary stuff and containing it.

Why aren’t businesses doing this already?

Midsize companies are just starting to figure out how to do this, because a lot of data privacy and data governance issues are being raised. They know they’re exposed, and so they’re trying to figure out, “How are we gonna keep track of our data?” And then they see cases where people have left companies, taking along massive data dumps. How do you know if somebody’s just read everything and written it to a thumb drive or uploaded it to Dropbox or something on their way out the door? Right now, it’s hard to tell.

You mentioned this is for humanly generated data. What does that mean?

We actually handle 400 different types of unstructured data: it’s your Office files or your Macintosh files or your Excel files or your [computer-aided design] files. We can look at the metadata within your videos and images, but we’re not actually looking inside your videos and images or texts right now. We could do that in the future. Our technology can tell who’s reading and writing any particular file.

Your first products shipped in October 2014. Where are you finding early success?

In legal, in accounting, in small financials, in state and county governments, because they really do have to keep track and make sure their content is secure. We also are finding some success in engineering companies and retail.

What’s top of mind for 2015?

This may sound obvious, but I really believe you should know what’s in your data, and you shouldn’t have to pay extra to know. It is the storage array’s responsibility to tell you.
It’s absurd that a business should have to create all this elaborate infrastructure just to figure this out.

Why aren’t you just a software company?

There’s this concept of primary storage, which is where you’re reading and writing your data. What companies do today is try to index that primary storage while they are using it, which can bring performance to its knees. At the same time, they might be trying to figure out what data has changed and why someone is using it. After that index, however, you lose the trail. Sometimes, you can’t return to certain files because they were overwritten. They no longer exist. We deliberately decided not to put the burden on the primary storage; we wanted to be able to index [data] without impacting day-to-day work. That’s why our technology needs to include both hardware and software.

What advice would you give to other business technology entrepreneurs, particularly as pressure rises to produce billion-dollar valuations?

My advice is to be 100% focused on happy customers. Don’t get caught up in, “I made this much,” or “My valuation is this,” or “My valuation is that.” Frankly, that’s all very meaningless. The meaningful metric is happy customers. And then the next meaningful metric is happy customers who continue to buy and want to actually refer you to other soon-to-be-happy customers. None of these customers want you to go out of business, so they want you to make them happy at a reasonable profit without gouging them. That’s why we’re not charging a premium for all the additional features we’ve added [to our storage products]. We believe that if you bought the house, you ought to be able to go inside and find out what the rooms look like. Philosophically, the same should be true of data storage technology.

This item first appeared in the Feb. 20 edition of Data Sheet, Fortune’s daily newsletter on the business of technology.