As I tell my students when I give my presentation on web scraping:
I am not a lawyer, and more importantly I am not your lawyer...
That said, my understanding of the issue is this.
Raw data cannot be protected by copyright, though other protections may exist. For example, if you pay for a membership to a website that holds data, you may have to agree to terms by which you could be legally bound (generic terms of service that you do not explicitly agree to would generally not be sufficiently binding).
Only the creative expression of data can be protected by copyright. The canonical example (and, to the best of my understanding, the subject of the case law which established the precedent) is a directory of telephone numbers.
The names and numbers themselves are simply data with zero creative merit. Printing them on a page, having made decisions about font type and size, page layout, and so on, constitutes some modicum of creative expression.
So you could, in theory, scan and OCR a telephone book (or manually re-key it, as was likely done in the original case) and use that data to print your own creative expression of a telephone book. What you could not do is photocopy the pages of someone else's telephone book and sell the copies.
I hope that clarifies things.
Again, this is all to my best understanding, and I welcome all corrections.
Lastly,
I am not a lawyer, and more importantly I am not your lawyer.
Best.