Urban mobility structure summarizes individual movement patterns and the interactions between people and the urban environment, and it is essential for urban management and public transportation route planning. Most current research on urban mobility structure discovery treats the urban environment as a static network in order to detect relationships between groups of people and urban areas, and it overlooks the key question of how individuals dynamically affect urban mobility structure. In this paper, we propose a spatio-temporal representation learning method based on reinforcement learning for discovering urban mobility structures. The model incorporates the knowledge graph of interactions between individuals and stations while accounting for the spatio-temporal heterogeneity of individual travel. Experimental results demonstrate the advantages of discovering urban mobility structure from individual travel: the approach better describes the interaction between individuals and urban areas and accounts more thoroughly for the intrinsic influence of individuals on mobility structure.