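Help needed: extracting numbers and characters at specific positions from a string
Hoping someone here can help.
The file contains lines like these: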
LYS107NZG697N72.91.9
GLU99OE2C687N43.22.2
ARG21NH1G64O63.12.2
......
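The numbers and characters I need were marked in red in the original post. Two things: first, from lines 1, 2 and 3, extract the numbers that follow LYS, GLU and ARG, i.e. 107, 99 and 21 (note: in "GLU99OE2C687N43.22.2" the character after 99 is the letter 'O', not the digit 0); second, from lines 1, 2 and 3, extract the characters 'G', 'C' and 'G' that come right before the numbers 697, 687 and 64. Many thanks in advance for any help!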
[Solution]
Regular expressions are enough. Here is a quick version in Python:
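import re

line = "LYS107NZG697N72.91.9"
# Group 1: number after the residue name; group 2: middle part; group 3: trailing fields
pattern = re.compile(r"^[A-Z]+(\d+)(\w+)([A-Z]\d+\.\d+\.\d+)")
match = pattern.match(line)
print(match.groups())
# ('107', 'NZG697', 'N72.91.9')
# Within the middle part, capture the digits after the last letter
sub_pattern = re.compile(r'\w+[A-Z](\d+)')
sub_match = sub_pattern.match(match.groups()[1])
print(sub_match.groups())
# ('697',)
match1 = pattern.match("GLU99OE2C687N43.22.2")
print(match1.groups())
# ('99', 'OE2C687', 'N43.22.2')
sub_match1 = sub_pattern.match(match1.groups()[1])
print(sub_match1.groups())
# ('687',)
# To capture the letter in front of those digits instead, move the group:
sub_pattern = re.compile(r'\w+([A-Z])\d+')
print(sub_pattern.match(match.groups()[1]).groups())
# ('G',)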
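The two-step match above can probably be folded into one pattern. The sketch below is only an illustration: the field layout is inferred from the three sample lines, and the names `RECORD`, `parse_record` and the group labels (`resnum`, `middle`, `letter`, `num`) are hypothetical, not something from the original post.

```python
import re

# Assumed layout, inferred only from the three sample lines:
# letters, a number, a middle part, one uppercase letter,
# a second number, then a tail shaped like "N72.91.9".
RECORD = re.compile(
    r"^[A-Z]+(?P<resnum>\d+)"
    r"(?P<middle>\w+)(?P<letter>[A-Z])(?P<num>\d+)"
    r"[A-Z]\d+\.\d+\.\d+$"
)

def parse_record(line):
    """Return (first number, letter before the second number), or None."""
    m = RECORD.match(line.strip())
    if m is None:
        return None  # line does not follow the assumed layout
    return int(m.group("resnum")), m.group("letter")

samples = [
    "LYS107NZG697N72.91.9",
    "GLU99OE2C687N43.22.2",
    "ARG21NH1G64O63.12.2",
]
for s in samples:
    print(parse_record(s))
# (107, 'G')
# (99, 'C')
# (21, 'G')
```

Returning `None` for lines that do not fit makes it easy to skip separators such as the "......" line when looping over a file.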
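// Another approach, in C: walk the string with sscanf; %n stores how many characters were consumed.
#include <stdio.h>
char s[]="123 ab 4";
char *p;
int v,n,k;
int main(void) {
    p=s;
    while (1) {
        k=sscanf(p,"%d%n",&v,&n);
        printf("k,v,n=%d,%d,%d\n",k,v,n);
        if (1==k) {          // an integer was read: advance past it
            p+=n;
        } else if (0==k) {   // no integer here: skip one character
            printf("skip char[%c]\n",p[0]);
            p++;
        } else {//EOF==k
            break;
        }
    }
    printf("End.\n");
    return 0;
}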
//k,v,n=1,123,3
//k,v,n=0,123,3
//skip char[ ]
//k,v,n=0,123,3
//skip char[a]
//k,v,n=0,123,3
//skip char[b]
//k,v,n=1,4,2
//k,v,n=-1,4,2
//End.
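The same scan-for-integers idea can be tried in Python on the sample lines. This is a rough sketch, not a tested solution for the full file: `digit_runs` and `extract` are hypothetical helper names. It assumes the first run of digits is the wanted number and that the tail (e.g. "N72.91.9") always contributes exactly the last three runs, so the second wanted number is the fourth run from the end. Counting from the front would not work, because the middle part can contain digits of its own (OE2, NH1).

```python
import re

def digit_runs(text):
    """Return (start_index, digits) for every run of digits, left to right."""
    return [(m.start(), m.group()) for m in re.finditer(r"\d+", text)]

def extract(line):
    """Return (first number, character before the 4th-from-last number)."""
    runs = digit_runs(line)
    # Assumption: at least five runs (first number, target number,
    # and three runs from the tail); otherwise give up.
    if len(runs) < 5:
        return None
    first = int(runs[0][1])
    start, _ = runs[-4]
    return first, line[start - 1]

print(digit_runs("123 ab 4"))
# [(0, '123'), (7, '4')]
for s in ("LYS107NZG697N72.91.9",
          "GLU99OE2C687N43.22.2",
          "ARG21NH1G64O63.12.2"):
    print(extract(s))
# (107, 'G')
# (99, 'C')
# (21, 'G')
```

If the tail can vary in shape, the count-from-the-end assumption breaks and an explicit pattern for the whole line is likely the safer choice.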