A scalable clustering system is described. In an embodiment, the clustering system is operable for extremely large-scale applications in which millions of items having tens of millions of features are clustered. In an embodiment, the clustering system uses a probabilistic cluster model which models uncertainty in the data set, where the data set may be, for example, advertisements which are subscribed to keywords, text documents containing text keywords, images having associated features, or other items. In an embodiment, the clustering system is used to generate additional features for associating with a given item. For example, additional keywords to which an advertiser may like to subscribe are suggested. The additional features that are generated have associated probability values which, in some embodiments, may be used to rank those features. In some examples, user feedback about the generated features is received and used to revise the feature generation process.
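By way of illustration only, the following minimal sketch shows one way that additional features could be generated and ranked by probability for a given item. It assumes a mixture of Bernoulli distributions as the probabilistic cluster model, with cluster priors and per-cluster feature probabilities that have already been learned. The model form, the function names (cluster_posterior, suggest_features), the keyword list and all numeric values are hypothetical and are not taken from the description above. Features that the item does not have are treated as absent when the cluster posterior is computed, which is a simplifying assumption of this sketch.

```python
import math

# Hypothetical example data: five keywords and two clusters.
KEYWORDS = ["shoes", "sneakers", "boots", "laptop", "tablet"]
CLUSTER_PRIORS = [0.5, 0.5]
# FEATURE_PROBS[k][f] = assumed probability that an item in cluster k has feature f.
# Values must lie strictly between 0 and 1 so that the logarithms are defined.
FEATURE_PROBS = [
    [0.90, 0.80, 0.70, 0.05, 0.05],  # a "footwear"-like cluster
    [0.05, 0.05, 0.05, 0.90, 0.80],  # an "electronics"-like cluster
]


def cluster_posterior(item_features, cluster_priors, feature_probs):
    """Return the posterior probability of each cluster given an item's features.

    item_features is a set of feature indices that the item has; every other
    feature is treated as absent (an assumption of this sketch).
    """
    n_features = len(feature_probs[0])
    log_scores = []
    for prior, probs in zip(cluster_priors, feature_probs):
        log_p = math.log(prior)
        for f in range(n_features):
            p = probs[f]
            log_p += math.log(p if f in item_features else 1.0 - p)
        log_scores.append(log_p)
    # Normalize in log space to avoid underflow when there are many features.
    max_log = max(log_scores)
    weights = [math.exp(s - max_log) for s in log_scores]
    total = sum(weights)
    return [w / total for w in weights]


def suggest_features(item_features, cluster_priors, feature_probs, top_n=3):
    """Rank features the item lacks by their predictive probability."""
    posterior = cluster_posterior(item_features, cluster_priors, feature_probs)
    scored = []
    for f in range(len(feature_probs[0])):
        if f in item_features:
            continue  # only suggest features the item does not already have
        # Predictive probability: cluster-weighted average of feature probabilities.
        p = sum(r * probs[f] for r, probs in zip(posterior, feature_probs))
        scored.append((f, p))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


# An advertisement currently subscribed only to "shoes" (feature index 0).
for index, prob in suggest_features({0}, CLUSTER_PRIORS, FEATURE_PROBS):
    print(f"{KEYWORDS[index]}: {prob:.3f}")
```

In this sketch the probability attached to each suggested keyword is the cluster-weighted average of the per-cluster feature probabilities, so keywords that are common in the clusters most likely to contain the item are ranked first. Revising the process in response to user feedback is not shown.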