To iterate over code points and strings, use a loop like this:

    UnicodeSetIterator it(set);
    while (it.next()) {
        processItem(it.getString());
    }

Each item in the set is accessed as a string. Set elements consisting of single code points are returned as strings containing just the one code point.
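ICU's UnicodeString stores UTF-16, so a string holding "just the one code point" has one code unit for a BMP character and two (a surrogate pair) for a supplementary one. The sketch below illustrates that encoding with the standard library only; `codePointToUtf16` is a hypothetical helper introduced here for illustration, not an ICU function.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not part of ICU): encode one code point as UTF-16,
// the form a one-code-point string takes inside a UTF-16 string class.
// Assumes cp is a Unicode scalar value (0..0x10FFFF, not a surrogate).
inline std::u16string codePointToUtf16(char32_t cp) {
    assert(cp <= 0x10FFFF && !(cp >= 0xD800 && cp <= 0xDFFF));
    std::u16string out;
    if (cp < 0x10000) {
        // BMP code point: a single 16-bit code unit.
        out.push_back(static_cast<char16_t>(cp));
    } else {
        // Supplementary code point: split into lead and trail surrogates.
        const char32_t v = cp - 0x10000;
        out.push_back(static_cast<char16_t>(0xD800 + (v >> 10)));
        out.push_back(static_cast<char16_t>(0xDC00 + (v & 0x3FF)));
    }
    return out;
}
```

A caller that measures the returned string by code units should therefore expect a length of 1 or 2 for a single-code-point element.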
To iterate over code point ranges, instead of individual code points, use a loop like this:

    UnicodeSetIterator it(set);
    while (it.nextRange()) {
        if (it.isString()) {
            processString(it.getString());
        } else {
            processCodepointRange(it.getCodepoint(), it.getCodepointEnd());
        }
    }
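To make the difference between the two loops concrete, the sketch below models a set as sorted, disjoint code point ranges plus strings and walks it both ways. `MiniSet`, `visitItems`, and `visitRanges` are hypothetical stand-ins built on the standard library only; they imitate the documented iteration order and are not ICU code.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a set: inclusive, sorted, disjoint code point
// ranges, plus sorted multi-character strings.
struct MiniSet {
    std::vector<std::pair<int32_t, int32_t>> ranges;
    std::vector<std::string> strings;
};

// Like the next() loop: one callback per code point, then one per string.
// Returns the number of items visited.
template <typename OnCodePoint, typename OnString>
int visitItems(const MiniSet& set, OnCodePoint onCodePoint, OnString onString) {
    int count = 0;
    for (const auto& r : set.ranges) {
        assert(r.first <= r.second);
        for (int32_t cp = r.first; cp <= r.second; ++cp) {
            onCodePoint(cp);
            ++count;
        }
    }
    for (const auto& s : set.strings) {
        onString(s);
        ++count;
    }
    return count;
}

// Like the nextRange() loop: one callback per range, then one per string.
// Returns the number of items visited.
template <typename OnRange, typename OnString>
int visitRanges(const MiniSet& set, OnRange onRange, OnString onString) {
    int count = 0;
    for (const auto& r : set.ranges) {
        assert(r.first <= r.second);
        onRange(r.first, r.second);
        ++count;
    }
    for (const auto& s : set.strings) {
        onString(s);
        ++count;
    }
    return count;
}
```

For a set holding a-c, x-z, and the string "ch", the item walk makes seven callbacks while the range walk makes three, which is why range iteration tends to be the cheaper choice for large sets.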
To iterate over only the strings, start with skipToStrings().
     * If isString() == true, the value is a
     * string, otherwise the value is a
     * single code point. Elements of either type can be retrieved
     * with the function getString(), while elements
     * consisting of a single code point can be retrieved with
     * getCodepoint().
     *
     * The order of iteration is all code points in sorted order,
     * followed by all strings in sorted order. Do not mix
     * calls to next() and nextRange() without
     * calling reset() between them. The results of doing so
     * are undefined.
     *
     * @return true if there was another element in the set.
     * @stable ICU 2.4
     */
    UBool next();

    /**
     * Returns the next element in the set, either a code point range
     * or a string. If there are no more elements in the set, returns
     * false. If isString() == true, the value is a
     * string and can be accessed with getString(). Otherwise the value is a
     * range of one or more code points from getCodepoint() to
     * getCodepointEnd() inclusive.
     *
     * The order of iteration is all code point ranges in sorted
     * order, followed by all strings in sorted order. Ranges are
     * disjoint and non-contiguous. The value returned from getString()
     * is undefined unless isString() == true. Do not mix calls to
     * next() and nextRange() without calling
     * reset() between them. The results of doing so are
     * undefined.
     *
     * @return true if there was another element in the set.
     * @stable ICU 2.4
     */
    UBool nextRange();

    /**
     * Sets this iterator to visit the elements of the given set and
     * resets it to the start of that set. The iterator is valid only
     * so long as set is valid.
     * @param set the set to iterate over.
     * @stable ICU 2.4
     */
    void reset(const UnicodeSet& set);

    /**
     * Resets this iterator to the start of the set.
     * @stable ICU 2.4
     */
    void reset();

    /**
     * ICU "poor man's RTTI", returns a UClassID for this class.
     *
     * @stable ICU 2.4
     */
    static UClassID U_EXPORT2 getStaticClassID();

    /**
     * ICU "poor man's RTTI", returns a UClassID for the actual class.
     *
     * @stable ICU 2.4
     */
    virtual UClassID getDynamicClassID() const override;

    // ======================= PRIVATES ===========================

private:

    // endElement and nextElement are really UChar32's, but we keep
    // them as signed int32_t's so we can do comparisons with
    // endElement set to -1. Leave them as int32_t's.
    /** The set
     */
    const UnicodeSet* set;
    /** End range
     */
    int32_t endRange;
    /** Range
     */
    int32_t range;
    /** End element
     */
    int32_t endElement;
    /** Next element
     */
    int32_t nextElement;
    /** Next string
     */
    int32_t nextString;
    /** String count
     */
    int32_t stringCount;

    /**
     * Points to the string to use when the caller asks for a
     * string and the current iteration item is a code point, not a string.
     */
    UnicodeString *cpString;

    /** Copy constructor. Disallowed.
     */
    UnicodeSetIterator(const UnicodeSetIterator&) = delete;

    /** Assignment operator. Disallowed.
     */
    UnicodeSetIterator& operator=(const UnicodeSetIterator&) = delete;

    /** Load range
     */
    void loadRange(int32_t range);
};

inline UBool UnicodeSetIterator::isString() const {
    return codepoint < 0;
}

inline UChar32 UnicodeSetIterator::getCodepoint() const {
    return codepoint;
}

inline UChar32 UnicodeSetIterator::getCodepointEnd() const {
    return codepointEnd;
}


U_NAMESPACE_END

#endif /* U_SHOW_CPLUSPLUS_API */

#endif
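The private fields suggest a two-phase cursor: walk the ranges first, then the strings, with a negative code point marking a string element (as the inline isString() tests). The following is a minimal sketch of that idea over a plain-vector stand-in. `MiniSet` and `MiniSetIterator` are hypothetical names, the field names are borrowed from the declarations for readability, and ICU's actual implementation may differ in its details.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a set (not ICU's UnicodeSet).
struct MiniSet {
    std::vector<std::pair<int32_t, int32_t>> ranges;  // inclusive, sorted, disjoint
    std::vector<std::string> strings;                 // sorted
};

class MiniSetIterator {
public:
    explicit MiniSetIterator(const MiniSet& s) : set(&s) { reset(); }

    // Yields each code point individually, then each string.
    bool next() {
        if (nextElement > endElement) {        // current range is used up
            if (range >= endRange) return nextStringItem();
            loadRange(++range);
        }
        codepoint = codepointEnd = nextElement++;
        return true;
    }

    // Yields each range as a whole, then each string.
    bool nextRange() {
        if (nextElement > endElement) {
            if (range >= endRange) return nextStringItem();
            loadRange(++range);
        }
        codepoint = nextElement;
        codepointEnd = endElement;
        nextElement = endElement + 1;          // consume the rest of the range
        return true;
    }

    // Abandons code point iteration so the next call yields the first string.
    void skipToStrings() { range = endRange; endElement = -1; nextElement = 0; }

    void reset() {
        endRange = static_cast<int32_t>(set->ranges.size()) - 1;
        stringCount = static_cast<int32_t>(set->strings.size());
        range = 0;
        endElement = -1;                       // signed, so -1 can mean "no range loaded"
        nextElement = 0;
        nextString = 0;
        if (endRange >= 0) loadRange(range);
    }

    bool isString() const { return codepoint < 0; }
    int32_t getCodepoint() const { return codepoint; }
    int32_t getCodepointEnd() const { return codepointEnd; }
    // Simplification: only meaningful while isString() is true.
    const std::string& getString() const {
        assert(isString());
        return set->strings[nextString - 1];
    }

private:
    bool nextStringItem() {
        if (nextString >= stringCount) return false;
        codepoint = -1;                        // negative sentinel: element is a string
        ++nextString;
        return true;
    }
    void loadRange(int32_t i) {
        nextElement = set->ranges[i].first;
        endElement = set->ranges[i].second;
    }

    const MiniSet* set;
    int32_t endRange = -1, range = 0, endElement = -1, nextElement = 0;
    int32_t nextString = 0, stringCount = 0;
    int32_t codepoint = 0, codepointEnd = 0;
};
```

Because next() and nextRange() share the same cursor fields, interleaving them without a reset() would resume from whatever position the other call left behind, which is consistent with the documented warning against mixing the two.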