HTML field hidden data

When adding content to a multiple-paragraph text field (HTML), we keep receiving a message that our HTML has hidden content.


But when we check the HTML on a website that reports its size, it says the full size is 278 KB, and according to your documentation the maximum size allowed before this message appears is 300 KB:
https://www.datocms.com/product-updates/field-warnings-for-html-metadata

What can we do so the message doesn’t show up and we don’t have hidden HTML?

Hey @isabella.marcondes,

That warning is for your own protection; it tries to warn you before you reach the record size limit. Normally this sort of hidden data is caused by copying & pasting from another tool that embeds its own hidden metadata. In your case it looks like maybe CKEditor is adding a bunch of data-cke-saved-href attributes to your links and images (which do nothing on a production website). It’s also adding a lot of repeated inline style attributes for things like font sizes.
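If you want to see how much of the field is taken up by that kind of markup, here is a rough sketch your web team could adapt. The helper name `hiddenDataReport` and both regular expressions are my own assumptions for illustration, not part of DatoCMS or CKEditor, and the regexes only approximate real HTML parsing.

```typescript
// Hypothetical helper (not a DatoCMS API): estimate how many bytes of an HTML
// string are spent on data-cke-* attributes and on inline style attributes.

// UTF-8 byte length, since size limits are about bytes rather than characters.
const byteLength = (s: string): number => new TextEncoder().encode(s).length;

// Assumed patterns: a quoted attribute preceded by whitespace.
const CKE_ATTRIBUTE = /\s+data-cke-[\w-]+\s*=\s*(?:"[^"]*"|'[^']*')/g;
const STYLE_ATTRIBUTE = /\s+style\s*=\s*(?:"[^"]*"|'[^']*')/g;

function bytesMatching(html: string, pattern: RegExp): number {
  const matches: string[] = html.match(pattern) ?? [];
  let total = 0;
  for (const m of matches) {
    total += byteLength(m);
  }
  return total;
}

function hiddenDataReport(html: string): { total: number; cke: number; style: number } {
  return {
    total: byteLength(html),
    cke: bytesMatching(html, CKE_ATTRIBUTE),
    style: bytesMatching(html, STYLE_ATTRIBUTE),
  };
}
```

Calling `hiddenDataReport(yourHtml)` returns the total size alongside the bytes attributable to each kind of attribute, which should show whether the editor attributes or the repeated inline styles are the bigger contributor.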

Is this copied & pasted from another CMS or website, perhaps?

The quickest workaround would be to run that HTML through something like https://www.htmlwasher.com/ (not affiliated with us; you can use any “html cleaner” or “html sanitizer” you like). Websites like that will reduce the HTML to just basic formatting tags. But this may be overly aggressive (for example, if you rely on the inline styles for proper formatting).

You can also try one of the HTML sanitizer plugins on the DatoCMS plugins marketplace: https://www.datocms.com/marketplace/plugins/browse?s=sanitize

If none of those work, you might want to check with your web team to see if there’s some custom sanitizer solution they can make for you, like maybe a custom configuration of sanitize-html to keep the tags that you need and nothing else. It might also help to make use of cleaner CSS formatting with reusable classes instead of repeated inline styles.
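As a starting point for that kind of custom cleanup, here is a minimal dependency-free sketch. It is an assumption-laden illustration, not sanitize-html and not anything built into DatoCMS: the function name `cleanHtml`, the `dropInlineStyles` option, and the regexes are all hypothetical, and a regex pass is not a real HTML parser, so a proper sanitizer library is likely the safer choice for production.

```typescript
// Hypothetical cleanup pass: remove data-cke-* attributes, and optionally
// inline style attributes, from an HTML string before saving it.

// Assumed patterns: a quoted attribute preceded by whitespace.
const CKE_ATTR = /\s+data-cke-[\w-]+\s*=\s*(?:"[^"]*"|'[^']*')/g;
const STYLE_ATTR = /\s+style\s*=\s*(?:"[^"]*"|'[^']*')/g;

function cleanHtml(html: string, opts: { dropInlineStyles?: boolean } = {}): string {
  // Always drop the editor-only attributes; they have no effect on a live site.
  let out = html.replace(CKE_ATTR, "");
  // Only drop inline styles when asked, since the formatting may depend on them.
  if (opts.dropInlineStyles) {
    out = out.replace(STYLE_ATTR, "");
  }
  return out;
}
```

By default this keeps the inline styles, so the formatting is untouched; passing `{ dropInlineStyles: true }` only makes sense once the same formatting is covered by reusable CSS classes.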

Does that help? Please let me know if we can clarify anything 🙂