nchar                  package:base                  R Documentation

_C_o_u_n_t _t_h_e _N_u_m_b_e_r _o_f _C_h_a_r_a_c_t_e_r_s (_B_y_t_e_s)

_D_e_s_c_r_i_p_t_i_o_n:

     'nchar' takes a character vector as an argument and returns a
     vector whose elements contain the sizes of the corresponding
     elements of 'x'.

_U_s_a_g_e:

     nchar(x, type = "bytes")

_A_r_g_u_m_e_n_t_s:

       x: character vector, or a vector to be coerced to a character
          vector.

    type: character string: partial matching to one of 'c("bytes",
          "chars", "width")'.  See Details.

_D_e_t_a_i_l_s:

     The 'size' of a character string can be measured in one of three
     ways:

     'bytes' The number of bytes needed to store the string (plus in C
          a final terminator which is not counted).

     'chars' The number of human-readable characters.

     'width' The number of columns 'cat' will use to print the string
          in a monospaced font.  The same as 'chars' if this cannot be
          calculated.

     These will often be the same, and almost always will be in
     single-byte locales.  There will be differences between the first
     two with multibyte character sequences, e.g. in UTF-8 locales. If
     the byte stream contains embedded 'nul' bytes, 'type = "bytes"'
     looks at all the bytes whereas the other two types look only at
     the string as printed by 'cat', up to the first 'nul' byte.
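
     As a rough illustration of the first two measures, the sketch
     below compares byte and character counts for a string holding one
     non-ASCII character.  It assumes the '\u' escape yields a UTF-8
     encoded string, and it passes 'type' explicitly so the result
     does not rely on the default.  The variable name 's' is
     illustrative only.

```r
# "caf\u00e9" is "café"; assumed to be stored as UTF-8, where the
# accented letter takes two bytes and the other letters one each.
s <- "caf\u00e9"

nchar(s, type = "bytes")   # 5
nchar(s, type = "chars")   # 4

# For a pure ASCII string the two measures agree.
nchar("abc", type = "bytes") == nchar("abc", type = "chars")   # TRUE
```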

     The internal equivalent of the default method of 'as.character' is
     performed on 'x' (so there is no method dispatch).  To operate on
     non-vector objects, pass them through 'deparse' first.
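
     A minimal sketch of the coercion described above, using ASCII-only
     values so that the result is the same for every 'type'.  The name
     'expr' is illustrative only.

```r
# Atomic vectors are coerced as if by as.character().
nchar(12345)   # 5, the size of "12345"
nchar(TRUE)    # 4, the size of "TRUE"

# A language object is not a vector, so deparse it to text first.
expr <- quote(a + b)
nchar(deparse(expr))   # 5, the size of "a + b"
```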

_V_a_l_u_e:

     An integer vector giving the size of each element; currently this
     is always '2' for missing values ('NA').

     If an element is invalid in a multi-byte character set such as
     UTF-8, its number of characters and the width will be 'NA'. 
     Otherwise the number of characters will be non-negative, so
     '!is.na(nchar(x, "chars"))' is a test of validity.

     Names, dims and dimnames are copied from the input.
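
     The following sketch shows attributes being carried over for a
     named vector and for a matrix.  The names 'x', 'n' and 'm' and the
     sample data are illustrative only.

```r
x <- c(first = "one", second = "three")
n <- nchar(x)
n          # named integer vector: first = 3, second = 5
names(n)   # "first" "second"

m <- matrix(c("a", "bb", "ccc", "dddd"), nrow = 2,
            dimnames = list(c("r1", "r2"), c("c1", "c2")))
dim(nchar(m))        # 2 2
dimnames(nchar(m))   # same row and column names as 'm'
```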

_N_o_t_e:

     This does *not* by default give the number of characters that will
     be used to 'print()' the string.  Use 'encodeString' to find the
     characters used to print the string.
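
     As a small sketch of the difference, a string containing a newline
     is compared below with its 'encodeString' form, in which the
     newline is written as an escape.  The variable name 's' is
     illustrative only.

```r
s <- "a\nb"   # three characters: 'a', newline, 'b'
nchar(s)      # 3

# encodeString() writes the newline as the two-character escape \n.
nchar(encodeString(s))                # 4

# With quote = '"' the surrounding quotes are counted as well.
nchar(encodeString(s, quote = '"'))   # 6
```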

     Embedded 'nul' bytes are included in the byte count (but not the
     final 'nul').

_R_e_f_e_r_e_n_c_e_s:

     Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) _The New S
     Language_. Wadsworth & Brooks/Cole.

_S_e_e _A_l_s_o:

     'strwidth', which gives the width of strings for plotting;
     'paste', 'substr', 'strsplit'.

_E_x_a_m_p_l_e_s:

     x <- c("asfef","qwerty","yuiop[","b","stuff.blah.yech")
     nchar(x)
     # 5  6  6  1 15

     nchar(deparse(mean))
     # 18 17

