I have a MySql table called
twitter_statuses which as the title suggests holds twitter statuses. here is the table structure.
this table contains all sort of characters in the text column, and some of the column holds Ruby object which are multi lined.
Im trying to select some of the records into a file and then load them to a new table but because of all the special characters some of the rows get “messed up”.
here is the query I use to dump the statuses
here is the query I use to re-load the dumped statuses
As you can see I tried replacing the commas in the text column with
**** to prevent them from splitting the fields, and replaced the new line in the multi line columns (
user_mentions) to prevent “cutting the line” when dumping the file to CSV.
currently the multi line fields load great (the replacement of the new line character works) my problem id with other characters that appear in other fields (all sort of character) with cause to split columns.
any ideas on what could I change there?
I would suggest remove all those
REPLACE(text,',','****') stuff, and use
FIELDS ENCLOSED BY...:
SELECT id, status_twitter_id, twitter_id, creator_twitter_identity_id, text, in_reply_to_status_id, in_reply_to_user_id, urls, hashtags, created_at, updated_at, sent_at, user_mentions FROM twitter_statuses into outfile '/tmp/twitter_statuses.csv' FIELDS TERMINATED BY ',' ENCLOSED BY '"' LINES TERMINATED BY 'n' ;
LOAD DATA INFILE like this:
LOAD DATA LOCAL INFILE '/srv/$SPLIT_FOLDER/$f' INTO TABLE my_db.twitter_statuses_new FIELDS TERMINATED BY ',' ENCLOSED BY '"' LINES TERMINATED BY 'n' ...
This should do the work.