It doesn't look like it is currently possible to keep the same folder structure when outputting scraped information. For example, if the input query text file looks something like this:
cars
cars/ford
cars/ford/focus/
cars/ford/focus/red
cars/porsche
cars/porsche/carrera
cars/porsche/carrera/black
cars/porsche/carrera/black/new
cars/porsche/carrera/black/used
cars/nissan
cars/nissan/xtrail
dogs/
dogs/bulldog/black
dogs/labrador/golden
dogs/labrador/golden/large
dogs/labrador/brown/large
...
Is it possible to maintain the same structure in the output files? Also, how hard would it be to output all of the scraped information straight into an SQL file, so that a database of the information could be created (again keeping the same structure)? Thank you.
In this case, use $query in the file name format, for example:
Code:
$query/$datefile.format().txt
This will create the same folder structure as the queries. For SQL output, you can use a result format like:
Code:
INSERT INTO blah VALUES('$query', '$p1.pr', ...)\n
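As a rough illustration of what a file name format like $query/$datefile.format().txt amounts to, here is a minimal Python sketch. It is not the scraper's own code: the helper name output_path, the temporary root folder, and the placeholder date stamp are all assumptions made for the example, and the real tool may handle trailing slashes differently.

```python
import tempfile
from pathlib import Path


def output_path(root: Path, query: str, stamp: str) -> Path:
    """Build <root>/<query>/<stamp>.txt (hypothetical helper, not part of the tool)."""
    # Strip stray slashes so "dogs/" and "dogs" map to the same folder.
    return root / query.strip("/") / f"{stamp}.txt"


queries = ["cars", "cars/ford/focus/", "dogs/labrador/golden/large"]

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for q in queries:
        # "2024-01-01" is a placeholder for whatever the date format produces.
        path = output_path(root, q, "2024-01-01")
        # Create every missing parent folder, mirroring the query structure.
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(f"results for {q}\n")
    # Collect the files that were written, relative to the root folder.
    created = sorted(p.relative_to(root).as_posix() for p in root.rglob("*.txt"))

print(created)
```

Each slash in the query becomes one folder level, so the output tree follows the query list.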
Thanks for the reply. I should have made it clear that my input file isn't in a folder structure, and that regex is used to extract the information needed from the input file. So my input text file is actually something like:
Bedroom (#1)
Bedroom (#1) Bedding (#2)
Bedroom (#1) Bedding (#2) Bed Pillows (#20445)
Bedroom (#1) Bedding (#2) Bed Skirts (#20450)
Bedroom (#1) Bedding (#2) Bed-in-a-Bag (#20469)
Bedroom (#1) Bedding (#2) Blankets & Throws (#175750)
Bedroom (#1) Bedding (#2) Canopies & Netting (#48090)
Bedroom (#1) Bedding (#2) Comforters & Sets (#45462)
Bedroom (#1) Bedding (#2) Decorative Bed Pillows (#115630)
Bedroom (#1) Bedding (#2) Duvet Covers & Sets (#37644)
Bedroom (#1) Bedding (#2) Mattress Pads & Feather Beds (#175751)
Bedroom (#1) Bedding (#2) Other Bedding (#25815)
Bedroom (#1) Bedding (#2) Pillow Shams (#43397)
Bedroom (#1) Bedding (#2) Quilts, Bedspreads & Coverlets (#175749)
I know this currently isn't achievable when using regex on the input file, so this is more of a request: could this functionality be added, or is there a workaround? Thanks.
Then use this file name format:
Code:
[% query.replace('\s*\(.+?\)\s*', '/') %]$datefile.format().txt
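To see what that pattern does to a query such as the ones listed earlier in the thread, here is a small Python sketch. It assumes Python's re module treats this particular pattern the same way the tool's replace does, and the function name query_to_folder is made up for the example.

```python
import re

# Same pattern as in the file name format: optional whitespace,
# a parenthesised group such as "(#20445)", optional whitespace.
PATTERN = re.compile(r"\s*\(.+?\)\s*")


def query_to_folder(query: str) -> str:
    """Replace each "(#id)" group and its surrounding spaces with a slash."""
    return PATTERN.sub("/", query)


top_level = query_to_folder("Bedroom (#1)")
nested = query_to_folder("Bedroom (#1) Bedding (#2) Bed Pillows (#20445)")

print(top_level)  # Bedroom/
print(nested)     # Bedroom/Bedding/Bed Pillows/
```

Spaces inside a category name (as in "Bed Pillows") are kept, because the pattern only consumes whitespace that sits directly next to a parenthesised ID. The trailing slash is what separates the last folder from the file name that follows it in the format.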
Thank you, I will try and let you know how I get on. I sent you a PM with the actual file I will be using as the input query text file in case the example I have typed up above is slightly different. Could you let me know if there are any changes to the regular expression above please? Thank you.