regex - File splitting using Perl
I'm trying to split large text files into several text files. I found a thread from a few years ago with a similar premise, but I couldn't find my exact situation.
https://unix.stackexchange.com/a/64691/183674
How would I split the following data if the first line didn't start with 00:00:00:00?
00:00:00:00 00:00:05:00 01sc_001.jpg
00:00:14:29 00:00:19:29 01sc_002.jpg
00:01:07:20 00:01:12:20 01sc_003.jpg
00:00:00:00 00:00:03:25 02mi_001.jpg
00:00:03:25 00:00:08:25 02mi_002.jpg
00:00:35:27 00:00:40:27 02mi_003.jpg
00:00:00:00 00:00:05:00 03bi_001.jpg
00:00:05:19 00:00:10:19 03bi_002.jpg
00:01:11:17 00:01:16:17 03bi_003.jpg
00:00:00:00 00:00:05:00 04cg_001.jpg
00:00:11:03 00:00:16:03 04cg_002.jpg
00:01:12:25 00:01:17:25 04cg_003.jpg
Here's the code for reference:
#!/usr/bin/env perl
use strict;
use warnings;

open(my $infh, '<', 'abc_tabdelim.txt') or die $!;

my $outfh;
my $filecount = 0;
while ( my $line = <$infh> ) {
    # A line starting with 00:00:00:00 begins a new output file.
    if ( $line =~ /^00:00:00:00/ ) {
        close($outfh) if $outfh;
        open($outfh, '>', sprintf('abc%02d_tabdelim.txt', ++$filecount)) or die $!;
    }
    print {$outfh} $line or die "Failed to write to file: $!";
}

close($outfh);
close($infh);
I tried adding print $line; on the next line after the while statement in an attempt to make it read line by line, as shown in other tutorials, but that did not rectify the issue.
I would appreciate any input.
Edit: for example, given input like
00:01:16:17 00:00:05:00 01sc_001.jpg
00:00:14:29 00:00:19:29 01sc_002.jpg
00:01:07:20 00:01:12:20 01sc_003.jpg
00:00:00:00 00:00:03:25 02mi_001.jpg
00:00:03:25 00:00:08:25 02mi_002.jpg
00:00:35:27 00:00:40:27 02mi_003.jpg
00:00:00:00 00:00:05:00 03bi_001.jpg
00:00:05:19 00:00:10:19 03bi_002.jpg
00:01:11:17 00:01:16:17 03bi_003.jpg
00:00:00:00 00:00:05:00 04cg_001.jpg
00:00:11:03 00:00:16:03 04cg_002.jpg
00:01:12:25 00:01:17:25 04cg_003.jpg
I would like 3 separate files, respectively containing
00:00:00:00 00:00:03:25 02mi_001.jpg
00:00:03:25 00:00:08:25 02mi_002.jpg
00:00:35:27 00:00:40:27 02mi_003.jpg

00:00:00:00 00:00:05:00 03bi_001.jpg
00:00:05:19 00:00:10:19 03bi_002.jpg
00:01:11:17 00:01:16:17 03bi_003.jpg

00:00:00:00 00:00:05:00 04cg_001.jpg
00:00:11:03 00:00:16:03 04cg_002.jpg
00:01:12:25 00:01:17:25 04cg_003.jpg
discarding the first 3 lines.
Does modifying the condition in the loop not do the job?
if ($line =~ /^00:00:00:00/ || !$outfh)
Suppose the first line does not start with 00:00:00:00 (a 'zero marker'). The regex match fails, but the file isn't open, so the || !$outfh condition is true. The code in the if body skips the close, opens a new file, and the line is written to the new file. Thereafter, the file is open, so the second half of the condition doesn't change the decision making (except to slow it down marginally and immeasurably).
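Put together, the loop with the modified condition might look like the sketch below. The sub name split_keep_leading and its $prefix parameter are made up here for illustration (the script in the question hard-codes the abc%02d_tabdelim.txt name); the only functional change from the question's loop is the extra || !$outfh test.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): reads lines from $infh
# and writes them to numbered files named "<prefix>NN_tabdelim.txt".
# A new file starts at each zero marker, or at the very first line when no
# file is open yet, so rows before the first marker end up in file 01.
sub split_keep_leading {
    my ($infh, $prefix) = @_;
    my $outfh;
    my $filecount = 0;
    while ( my $line = <$infh> ) {
        if ( $line =~ /^00:00:00:00/ || !$outfh ) {
            close($outfh) if $outfh;
            open($outfh, '>', sprintf('%s%02d_tabdelim.txt', $prefix, ++$filecount))
                or die "Failed to open output file: $!";
        }
        print {$outfh} $line or die "Failed to write to file: $!";
    }
    close($outfh) if $outfh;    # guard against empty input
    return $filecount;          # number of files written
}
```

Calling it with a handle opened on abc_tabdelim.txt and the prefix 'abc' should give the same file names as the question's script.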
The question was clarified after I first proffered this solution. If you want to discard the rows before the first zero marker, modify the print so that it only prints if the file handle is open (instead of using the modified condition, which opens a file if the first line does not start with a zero marker).
print $outfh $line or die "Failed to write to file: $!" if $outfh;
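As a rough sketch of that discard variant, the loop could be written as follows. The sub name split_discard_leading and the $prefix parameter are assumptions made for this example, not part of the original script; the condition is the unmodified one from the question, and only the print is guarded.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): like the question's
# loop, but rows that appear before the first zero marker are dropped
# because no output file is open when they are read.
sub split_discard_leading {
    my ($infh, $prefix) = @_;
    my $outfh;
    my $filecount = 0;
    while ( my $line = <$infh> ) {
        # Only a zero marker opens a new file.
        if ( $line =~ /^00:00:00:00/ ) {
            close($outfh) if $outfh;
            open($outfh, '>', sprintf('%s%02d_tabdelim.txt', $prefix, ++$filecount))
                or die "Failed to open output file: $!";
        }
        # Print only once a file handle exists; earlier rows are skipped.
        print {$outfh} $line or die "Failed to write to file: $!" if $outfh;
    }
    close($outfh) if $outfh;
    return $filecount;          # number of files written
}
```

For the edited example input, this should produce three files (the 02mi, 03bi and 04cg groups), with the first three rows discarded.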