How do I process these strings?

xlzheng · posted 2007-1-31 15:46:50

I have a file like this:
12 OCT 2006; Jcake willer <jacke_willer@163.com> libdeiododng-1.2.3.-r1.ebuild
added new information in this ebuild.
20 DEV 2006; Miachel Jordan <miachel_jordan@gmail.com> app-pda-2.1.10-r1.ebuild
modified some information in this ebuild.
..........
Each entry above may span more than one line.
I want to process these entries by prefixing each field with a label, so the result looks like this:
Develope time: 12 OCT 2006; Developer: Jcake willer; Email:<jacke_willer@163.com>; Product:libdeiododng-1.2.3.-r1.ebuild
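The description under each entry is not needed.
I'm not comfortable with regular expressions yet, so I haven't managed to handle this properly. Any pointers would be appreciated, thanks.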

zhy2111314 · posted 2007-3-2 23:53:36

Sorry, I didn't read the question carefully. Let me take another look, heh.

chunchengfh · posted 2007-3-3 19:42:44

The script test.sh looks like this:

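#!/bin/sh
# Read the input file line by line (IFS= and -r keep the line intact)
while IFS= read -r line
do
    # Use grep to check whether this line is an entry header that needs rewriting:
    if printf '%s\n' "$line" | grep -qs "[0-9] [A-Z][A-Z][A-Z] [0-9][0-9][0-9][0-9];.*<.*@.*\..*> "; then
        # Split on ';', '<' and '>' to get the four fields
        time=$(printf '%s\n' "$line" | awk -F"[;<>]" '{print $1}')
        name=$(printf '%s\n' "$line" | awk -F"[;<>]" '{print $2}')
        email=$(printf '%s\n' "$line" | awk -F"[;<>]" '{print $3}')
        software=$(printf '%s\n' "$line" | awk -F"[;<>]" '{print $4}')
        # Strip leading/trailing spaces (the unquoted echo is deliberate).
        # Depending on how strict the format must be, these four lines may be unnecessary.
        time=$(echo $time)
        name=$(echo $name)
        email=$(echo $email)
        software=$(echo $software)
        # Output
        printf 'Develop time: %s; ' "$time"
        printf 'Developer: %s; ' "$name"
        printf 'Email:<%s>; ' "$email"
        printf 'Product:%s\n' "$software"
    # To keep the unrelated lines, remove the "#" from the next two lines
    #else
    #    printf '%s\n' "$line"
    fi
done < "$1"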


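Usage: ./test.sh file1 > file2

chunchengfh · posted 2007-3-3 19:47:27

Only after posting the above did I notice this is the Python board...
I'll leave the shell version here for now and come back to revise it once I've learned Python's re.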
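
For anyone looking for the Python version that this post defers, here is a minimal sketch using the standard re module. It assumes every entry header fits on one line in the form "DD MON YYYY; name <email> product", as in the sample data. The names HEADER, format_entry and convert are made up for illustration and do not come from the thread.

```python
import re

# Assumed header shape: "12 OCT 2006; Name <email> product.ebuild"
HEADER = re.compile(
    r"^\s*(?P<date>\d{1,2} [A-Z]{3} \d{4});\s*"
    r"(?P<name>[^<]+?)\s*"
    r"<(?P<email>[^>\s]+@[^>\s]+)>\s*"
    r"(?P<product>\S+)\s*$"
)


def format_entry(line):
    """Return the labelled form of a header line, or None for any other line."""
    m = HEADER.match(line)
    if m is None:
        return None
    return ("Develop time: {date}; Developer: {name}; "
            "Email:<{email}>; Product:{product}").format(**m.groupdict())


def convert(lines):
    """Yield labelled header lines and drop the description lines."""
    for line in lines:
        out = format_entry(line)
        if out is not None:
            yield out


# Sample data copied from the question
sample = [
    "12 OCT 2006; Jcake willer <jacke_willer@163.com> libdeiododng-1.2.3.-r1.ebuild\n",
    "added new information in this ebuild.\n",
    "20 DEV 2006; Miachel Jordan <miachel_jordan@gmail.com> app-pda-2.1.10-r1.ebuild\n",
    "modified some information in this ebuild.\n",
]
for out in convert(sample):
    print(out)
```

To run it on a real file, pass an open file object to convert() instead of the sample list, since iterating a file yields its lines. A header that wraps onto a second line would not match this pattern and would need to be joined first.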

huan · posted 2007-3-3 20:16:55

For regular expressions, Perl is still the way to go.......

[php]
[0 No.2081 huan@huan ~/tmp]$ cat foo
12 OCT 2006; Jcake willer <jacke_willer@163.com> libdeiododng-1.2.3.-r1.ebuild
added new information in this ebuild.
20 DEV 2006; Miachel Jordan <miachel_jordan@gmail.com> app-pda-2.1.10-r1.ebuild
modified some information in this ebuild.

[0 No.2082 huan@huan ~/tmp]$ perl -lne 'if ( /(^.*); ([^<]+) <(\S+)> (\S+)$/ ) {
print "Develope time: $1; Developer: $2; Email:<$3>; Product: $4" }' < foo
Develope time: 12 OCT 2006; Developer: Jcake willer; Email:<jacke_willer@163.com>; Product: libdeiododng-1.2.3.-r1.ebuild
Develope time: 20 DEV 2006; Developer: Miachel Jordan; Email:<miachel_jordan@gmail.com>; Product: app-pda-2.1.10-r1.ebuild

[0 No.2083 huan@huan ~/tmp]$

[/php]

huan · posted 2007-3-3 20:18:33

By the way, "Develope" isn't a correct word, is it? Heh.

chunchengfh · posted 2007-3-3 21:00:02

Inspired by huan, here is a sed version: *_*

sed -n 's/\([0-9]* [A-Z]* [0-9]*;\)\(.*\)\(<.*@.*\..*>\)\(.*\)/Develop time:\1 Developer:\2 Email:\3 Product: \4/p' file1 > file2


To keep the unrelated lines, change sed -n 's/..../p' to sed 's/..../'.
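
The same keep-or-drop switch can be sketched in Python with re.subn, which returns the number of substitutions made alongside the new string, so a zero count marks an unrelated line. This is only an illustration: PATTERN, TEMPLATE, rewrite and the keep_unmatched flag are hypothetical names, and the pattern is a loose approximation of the sed expression rather than an exact port.

```python
import re

# Loosely mirrors the sed pattern: date; name <email> product
PATTERN = re.compile(r"^(\d+ [A-Z]+ \d+);\s*(.*?)\s*<([^>]*@[^>]*)>\s*(.*?)\s*$")
TEMPLATE = r"Develop time: \1; Developer: \2; Email:<\3>; Product:\4"


def rewrite(lines, keep_unmatched=False):
    """Rewrite header lines; keep_unmatched=False behaves like sed -n 's/.../.../p'."""
    result = []
    for line in lines:
        line = line.rstrip("\n")
        new_line, count = PATTERN.subn(TEMPLATE, line)
        if count:
            # The substitution fired: this was a header line
            result.append(new_line)
        elif keep_unmatched:
            # Like plain sed 's/.../.../': pass unrelated lines through unchanged
            result.append(line)
    return result


lines = [
    "12 OCT 2006; Jcake willer <jacke_willer@163.com> libdeiododng-1.2.3.-r1.ebuild",
    "added new information in this ebuild.",
]
print(rewrite(lines))
print(rewrite(lines, keep_unmatched=True))
```

With keep_unmatched left at False only the rewritten headers come back; with True the description lines are returned untouched in their original positions.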

zhy2111314 · posted 2007-3-3 23:19:34

The sed approach does feel more concise.
