Using the C strtok and strsep functions

=Start=

Body:

Reference answer:

Function prototypes:
char *strtok(char *s, const char *delim);
char *strsep(char **s, const char *delim);

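Purpose: both strtok and strsep split a string into a sequence of substrings (tokens). s is the string to split (for strsep, a pointer to the pointer to that string), and delim is a string in which every character counts as a delimiter.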

Return value: a pointer to the next token, working forward from the start of s; NULL once no tokens remain.

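Similarity: both modify the source string. To avoid that, work on a copy made with strdupa (a GNU extension that allocates on the stack with alloca) or strdup (which allocates with malloc, so the copy must be released with free).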
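
As a minimal sketch of the copy-first approach, the helper below tokenizes a strdup copy so that the caller's string is left untouched. The name count_tokens is hypothetical and chosen only for this illustration, and the _DEFAULT_SOURCE define assumes glibc.

```c
#define _DEFAULT_SOURCE   /* assumption: glibc; makes strdup() visible */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: counts the tokens in text without modifying text. */
size_t count_tokens(const char *text, const char *delim)
{
    assert(text != NULL && delim != NULL);

    char *copy = strdup(text);   /* heap copy that strtok is free to modify */
    if (copy == NULL)
        return 0;                /* allocation failed */

    size_t n = 0;
    for (char *tok = strtok(copy, delim); tok != NULL; tok = strtok(NULL, delim))
        n++;

    free(copy);                  /* strdup memory comes from malloc */
    return n;
}
```

Because only the copy is tokenized, text can even be a string literal, which strtok itself must never be given directly.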

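Difference: strtok does not rewrite every delimiter on the first call. Each call skips any leading delimiter characters, overwrites the delimiter that ends the current token with '\0' (the NUL character, not the NULL pointer), and remembers where it stopped in internal static state; repeated calls to strtok(NULL, delim) then return the remaining tokens one by one. strsep returns the string in front of the first delimiter, overwrites that delimiter with '\0', and updates *s to point just past it (or sets *s to NULL when no delimiter is left).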
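
The two behaviours can be checked with a small sketch. Both helper names below are hypothetical, made up for this illustration; the _DEFAULT_SOURCE define assumes glibc, where it makes strsep visible.

```c
#define _DEFAULT_SOURCE   /* assumption: glibc; makes strsep() visible */
#include <assert.h>
#include <string.h>

/* Hypothetical helper: calls strtok() once on buf (len bytes, terminator
 * excluded) and reports how many of those bytes are '\0' afterwards. */
size_t nul_bytes_after_first_strtok(char *buf, size_t len, const char *delim)
{
    assert(buf != NULL && delim != NULL);

    (void)strtok(buf, delim);
    size_t count = 0;
    for (size_t i = 0; i < len; i++)
        if (buf[i] == '\0')
            count++;
    return count;
}

/* Hypothetical helper: splits "key=value" in place. Returns the key and
 * stores the remainder in *value (NULL when the input contains no '='). */
char *split_key_value(char *pair, char **value)
{
    assert(pair != NULL && value != NULL);

    char *rest = pair;
    char *key = strsep(&rest, "=");  /* '=' becomes '\0'; rest moves past it */
    *value = rest;
    return key;
}
```

For the buffer "a,b,c" the first helper reports a single '\0', which shows that only the delimiter ending the first token has been overwritten at that point.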

Test code:

#include <stdio.h>
#include <string.h>

int main(void) {
    char s[] = "hello, world! welcome to china!";
    char delim[] = " ,!";
    printf("Original: '%s'\n\n", s);

    char *token;
    for(token = strtok(s, delim); token != NULL; token = strtok(NULL, delim)) {
        printf("%s", token);  /* never pass data as the format string */
        printf("+");
    }
    printf("\n");
    printf("After strtok: '%s'\n", s);  /* s now ends at the first '\0' written by strtok */
    return 0;
}

/* Output:
Original: 'hello, world! welcome to china!'

hello+world+welcome+to+china+
After strtok: 'hello'
*/

#include <stdio.h>
#include <string.h>

int main(void)
{
  char str[] = "acdeabx";
  char *p = NULL;
  char delim[] = "ab";  /* delim is a set of delimiter characters */

  printf("the original str is : %s\n\n", str);

  p = strtok(str, delim);
  while(p != NULL){
    printf("the token is :%s\n", p);
    printf("the str is : %s\n", str); /* the source string str has been modified */
    p = strtok(NULL, delim);
  }
  return 0;
}

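#include <stdio.h>
#include <string.h>

int main(void){
    char str[] = "2,张三,89,99,66";
    // char str[] = ",,2,张三,89,99,66";
    // str is an array, not a modifiable pointer: &str has type char (*)[N], not the
    // char ** that strsep expects, and strsep must be able to update the pointer it is given.
    // Forcing &str through a cast would make strsep treat the array bytes as a pointer
    // and most likely crash, so define a separate pointer variable and pass its address.
    char *strv = str;
    char *token = strsep(&strv, ",");
    while(token != NULL){
        printf("%s\t", token);
        token = strsep(&strv, ",");
    }
    printf("\n");
    return 0;
}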

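#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void) {
    char source[] = "hello, world! welcome to china!";
    char delim[] = " ,!";
    printf("Original: '%s'\n\n", source);

    char *copy = strdup(source);  /* keep this pointer so the copy can be freed */
    if (copy == NULL)
        return 1;
    char *s = copy;
    char *token;
    for(token = strsep(&s, delim); token != NULL; token = strsep(&s, delim)) {
        printf("%s", token);  /* never pass data as the format string */
        printf("+");
    }
    printf("\n\n");
    /* strsep sets s to NULL once the last token has been returned */
    printf("After strsep: '%s'\n", s ? s : "(null)");
    free(copy);
    return 0;
}

/* Output:
Original: 'hello, world! welcome to china!'

hello++world++welcome+to+china++

After strsep: '(null)'
*/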

Why is there only a single "+" between substrings with strtok, while strsep produces several? The documentation explains:

One difference between strsep and strtok_r is that if the input string contains more than one character from delimiter in a row strsep returns an empty string for each pair of characters from delimiter. This means that a program normally should test for strsep returning an empty string before processing it.

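In short: when the input contains several consecutive characters from delim (in this example, the comma plus space and the exclamation mark plus space in source), strtok treats the whole run as a single separator and never returns an empty token, whereas strsep returns an empty string "" for each pair of adjacent delimiters. So when splitting with strsep, check whether the returned token is an empty string before processing it. This explains the extra "+" signs in the strsep example.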
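
The empty-string check can be sketched as follows. The helper split_nonempty and its parameters are hypothetical, introduced only for this illustration, and the _DEFAULT_SOURCE define assumes glibc.

```c
#define _DEFAULT_SOURCE   /* assumption: glibc; makes strsep() visible */
#include <assert.h>
#include <string.h>

/* Hypothetical helper: splits buf in place with strsep, stores up to max
 * non-empty tokens in out, and returns how many were stored. */
size_t split_nonempty(char *buf, const char *delim, char **out, size_t max)
{
    assert(buf != NULL && delim != NULL && out != NULL);

    size_t n = 0;
    char *rest = buf;
    char *tok;
    while (n < max && (tok = strsep(&rest, delim)) != NULL) {
        if (*tok == '\0')
            continue;        /* empty field produced by adjacent delimiters */
        out[n++] = tok;
    }
    return n;
}
```

Applied to "hello, world! welcome to china!" with the delimiters " ,!", this yields the same five tokens that strtok produces. Dropping the check keeps the empty fields, which is what you want for formats such as CSV where an empty column is meaningful.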

In our own programs it is better to avoid strtok where possible and use strsep instead, because:

* – Added strsep() which will replace strtok() soon (because strsep() is reentrant and should be faster). Use only strsep() in new code, please.

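That is, compared with strtok(), strsep() is reentrant (and therefore safer, because it keeps no hidden static state) and is expected to be faster. Keep in mind that strsep() is a BSD extension that glibc also provides; it is not part of ISO C or POSIX.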
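
One place where reentrancy matters is nested tokenization. The sketch below splits records on ';' and, inside that loop, splits each record on ','. The helper count_fields and the input format are hypothetical, and the _DEFAULT_SOURCE define assumes glibc. The same nesting written with strtok would break, because the inner loop overwrites the single saved position that the outer loop relies on.

```c
#define _DEFAULT_SOURCE   /* assumption: glibc; makes strsep() visible */
#include <assert.h>
#include <string.h>

/* Hypothetical helper: text holds ';'-separated records made of
 * ','-separated fields. Returns the total number of fields and stores
 * the number of records in *records. Modifies text in place. */
size_t count_fields(char *text, size_t *records)
{
    assert(text != NULL && records != NULL);

    size_t fields = 0;
    *records = 0;

    char *outer = text;              /* state of the outer split */
    char *record;
    while ((record = strsep(&outer, ";")) != NULL) {
        (*records)++;
        char *inner = record;        /* independent state of the inner split */
        while (strsep(&inner, ",") != NULL)
            fields++;
    }
    return fields;
}
```

Each loop carries its own position in an ordinary local pointer, so the two splits cannot interfere. For "1,2,3;4,5;6" the helper finds 3 records and 6 fields.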


=END=

October 24, 2016

admin
