I’ve worked professionally with data for 20 years. Today I am working with literally the worst data set I have ever come across. It’s published every week by CQC. Every other week there are formatting errors or typos in either the filename or the headers. Sometimes the order of the headers changes, and sometimes the set of headers that are included changes. It is virtually impossible to automate. For example: https://www.diffchecker.com/3Lhlr6SF/
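
To give a flavour of the defensive code this forces you to write, here is a minimal sketch of header matching that tolerates stray whitespace, changed punctuation, reordered columns and small typos. The column names in `CANONICAL` are hypothetical placeholders, not the real CQC headers, and the `0.85` similarity cutoff is an arbitrary assumption that would need tuning against real files.

```python
import difflib
import re

# Hypothetical canonical column names (assumption: not the real CQC headers).
CANONICAL = ["location_id", "provider_name", "publication_date"]


def normalise(header: str) -> str:
    """Lower-case, trim, and collapse runs of non-alphanumerics to underscores."""
    return re.sub(r"[^a-z0-9]+", "_", header.strip().lower()).strip("_")


def match_header(raw: str, cutoff: float = 0.85):
    """Return the canonical name for a raw header, or None if nothing is close."""
    key = normalise(raw)
    if key in CANONICAL:
        return key
    # Fall back to fuzzy matching to catch small typos such as a dropped letter.
    close = difflib.get_close_matches(key, CANONICAL, n=1, cutoff=cutoff)
    return close[0] if close else None


def reorder(row, raw_headers):
    """Rearrange one data row into CANONICAL order; missing columns become None."""
    by_name = {}
    for raw, value in zip(raw_headers, row):
        name = match_header(raw)
        if name is not None:
            by_name[name] = value
    return [by_name.get(name) for name in CANONICAL]


# Usage: this week's file has reordered columns, a typo and an extra column.
headers = ["Provder Name", " Location-ID ", "Notes"]
print(reorder(["Acme Care", "1-123", "n/a"], headers))
```

Even with something like this in place, any header that comes back as `None` still needs a human to look at it, which is why the weekly process never becomes fully hands-off.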
