Fixing the .html.html URL problem in Magento 2
A while ago we took over a Magento 2 shop that had been migrated from Magento 1. Some products and categories had a “.html.html” URL suffix, and the merchant wanted this fixed. When the data was migrated from the Magento 1 shop, the url keys came over with the ".html" suffix already included. Magento 2 uses the url_key and url_path attributes to build the URLs for products and categories. The url_key attribute already contained the “.html” suffix, and url_path appended a second “.html” on top of it, which explained the doubled suffix.
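To see how many entries are affected, a query along these lines can list the product url_key values that still end in ".html". This is a minimal sketch that assumes Magento Open Source (where the value tables are keyed by entity_id), no table prefix, and the default entity_type_id of 4 for products; for categories, the table would be catalog_category_entity_varchar with entity_type_id 3.

```sql
-- List product url_key values that still carry the ".html" suffix.
-- Assumes no table prefix and the default entity_type_id 4 for products.
SELECT v.entity_id,
       v.store_id,
       v.value AS url_key
FROM catalog_product_entity_varchar AS v
JOIN eav_attribute AS a
  ON a.attribute_id = v.attribute_id
WHERE a.attribute_code = 'url_key'
  AND a.entity_type_id = 4
  AND v.value LIKE '%.html';
```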
In addition, the url_path attribute was deprecated in Magento 2.1, so ideally we would remove its values and rebuild the url_keys without the “.html” suffix. First, we deleted the stored url_path values for both products and categories:
DELETE FROM catalog_category_entity_varchar WHERE attribute_id = (SELECT attribute_id FROM eav_attribute WHERE attribute_code = 'url_path' AND entity_type_id = 3);
DELETE FROM catalog_product_entity_varchar WHERE attribute_id = (SELECT attribute_id FROM eav_attribute WHERE attribute_code = 'url_path' AND entity_type_id = 4);
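If the url_key values themselves still end in ".html", the suffix has to be removed there as well. The following is only a sketch of how that could be done, not necessarily the exact statement used here. It assumes the same table layout as above and should be tried on a database backup first; the category table can be handled the same way with entity_type_id 3.

```sql
-- Strip a trailing ".html" from product url_key values.
-- Sketch only: assumes no table prefix and entity_type_id 4 for products.
UPDATE catalog_product_entity_varchar AS v
JOIN eav_attribute AS a
  ON a.attribute_id = v.attribute_id
SET v.value = LEFT(v.value, CHAR_LENGTH(v.value) - CHAR_LENGTH('.html'))
WHERE a.attribute_code = 'url_key'
  AND a.entity_type_id = 4
  AND v.value LIKE '%.html';
```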
Then we deleted all autogenerated URL rewrites for both products and categories:
DELETE FROM url_rewrite WHERE entity_type IN ('product', 'category') AND is_autogenerated = 1;
Finally, we let the elgentos/regenerate-catalog-urls module do its job and regenerate all those URLs:
bin/magento regenerate:product:url
bin/magento regenerate:category:url
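Once the regeneration has finished, a quick check like the one below can confirm that no rewrite still ends in the doubled suffix. It is a minimal sketch that assumes the url_rewrite table has no prefix; an empty result means no request path ends in ".html.html" anymore.

```sql
-- Count remaining rewrites whose request path ends in ".html.html".
SELECT entity_type,
       COUNT(*) AS remaining
FROM url_rewrite
WHERE request_path LIKE '%.html.html'
GROUP BY entity_type;
```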
During that process, we got a few “Duplicated url” warnings, which meant that some products did not have unique names and therefore had no unique url_keys. After the merchant made some changes, everything was working fine again.
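To find such duplicates up front, a query like the following can list url_key values shared by more than one product within the same store scope. It is a sketch under the same assumptions as before (Magento Open Source, no table prefix, entity_type_id 4 for products), and it does not catch clashes between the default scope and a store-specific value.

```sql
-- Find url_key values used by more than one product in the same store scope.
SELECT v.store_id,
       v.value AS url_key,
       COUNT(*) AS uses,
       GROUP_CONCAT(v.entity_id ORDER BY v.entity_id) AS product_ids
FROM catalog_product_entity_varchar AS v
JOIN eav_attribute AS a
  ON a.attribute_id = v.attribute_id
WHERE a.attribute_code = 'url_key'
  AND a.entity_type_id = 4
GROUP BY v.store_id, v.value
HAVING COUNT(*) > 1;
```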